I have the following program
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <string.h>
4
5 int check_authentication(char *password){
6 char password_buffer[16];
7 int auth_flag =0;
8
9
10 strcpy(password_buffer, password);
11
12 if(strcmp(password_buffer, "brillig" ) == 0 )
13 auth_flag = 1;
14 if(strcmp(password_buffer, "outgrabe") == 0)
15 auth_flag = 1;
16
17 return auth_flag;
18 }
19
20 int main(int argc, char *argv[]){
21 if (argc<2){
22 printf("Usage: %s <password>\n", argv[0]);
23 exit(0);
24 }
25
26 if(check_authentication(argv[1])){
27 printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=\n");
28 printf(" Access Granted.\n");
29 printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=\n");
30 }
31 else {
32 printf("\n Access Denied. \n");
33 }
34 }
I am running it under gdb with an argument of 30 'A' bytes, and I am setting the following breakpoints:
(gdb) break 9
Breakpoint 1 at 0x80484c1: file auth_overflow2.c, line 9.
(gdb) break 16
Breakpoint 2 at 0x804850f: file auth_overflow2.c, line 16.
(gdb) run AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
So far so good. Everything goes as expected up to the next breakpoint:
Breakpoint 1, check_authentication (password=0xbffff6d2 'A' <repeats 30 times>)
at auth_overflow2.c:10
10 strcpy(password_buffer, password);
(gdb) x/s password_buffer
0xbffff484: "\364\237\374\267\240\205\004\b\250\364\377\277\245", <incomplete sequence \352\267>
(gdb) x/x &auth_flag
0xbffff494: 0x00
Now we see the following information:
The variable auth_flag is at address 0xbffff494 and the variable password_buffer is at address 0xbffff484. Since the address of auth_flag is greater than the address of password_buffer, and the stack grows towards lower addresses, the extra bytes that overrun password_buffer WILL NOT OVERWRITE auth_flag. Right?
But gdb has a different opinion...
(gdb) cont
Continuing.
Breakpoint 2, check_authentication (
password=0xbf004141 <Address 0xbf004141 out of bounds>)
at auth_overflow2.c:17
17 return auth_flag;
(gdb) x/s password_buffer
0xbffff484: 'A' <repeats 30 times>
(gdb) x/x &auth_flag
0xbffff494: 0x41
and ...
(gdb) x/16xw &auth_flag
0xbffff494: 0x41414141 0x41414141 0x41414141 0xbf004141
0xbffff4a4: 0x00000000 0xbffff528 0xb7e8bbd6 0x00000002
0xbffff4b4: 0xbffff554 0xbffff560 0xb7fe1858 0xbffff510
0xbffff4c4: 0xffffffff 0xb7ffeff4 0x080482bc 0x00000001
We see that auth_flag was overwritten with 0x41 bytes (= 'A'), although this variable was in a lower position on the stack. Why did this happen?
Stack growth direction has nothing to do with where the extra bytes go when you overrun a buffer. Overruns from strcpy always go into higher addresses (unless you overrun so far that you wrap around to address 0, which is pretty unlikely).
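To make the arithmetic concrete, here is a minimal C sketch. The helper name bytes_past_end is made up for illustration. It computes how many bytes strcpy would write beyond a destination of a given size. With the 16-byte password_buffer and a 30-character argument, 15 bytes land above the buffer, and since auth_flag sits exactly 16 bytes above the start of the buffer (0xbffff494 - 0xbffff484), the first spilled byte already hits it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes strcpy would write past a
 * destination of dst_size bytes. strcpy copies strlen(src) characters
 * plus the terminating '\0', always at increasing addresses starting
 * from the destination. */
size_t bytes_past_end(size_t dst_size, const char *src)
{
    assert(src != NULL);
    size_t needed = strlen(src) + 1;   /* characters plus terminator */
    return needed > dst_size ? needed - dst_size : 0;
}
```

Calling bytes_past_end(16, argv[1]) before the copy would show the size of the overrun without performing it.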
Objects are stored in memory from lower addresses up to higher addresses. Since you cannot guarantee that the string pointed to by the parameter password is shorter than 16 characters, your code is invalid.
In fact there is no need for the local buffer password_buffer at all.
The function could be written the following way
_Bool check_authentication( const char *password )
{
return ( strcmp( password, "brillig" ) == 0 || strcmp( password, "outgrabe" ) == 0 );
}
Instead of the return type _Bool you may use type int, as in your own implementation of the function. In either case 1 or 0 will be returned.
The compiler is free to order local variables on the stack however it likes. In this case it placed the char array below the int variable, so bytes that overflow the array run into auth_flag. This makes the program vulnerable to a stack-based buffer overflow.
In order to change the following:
(gdb) x/s password_buffer
0xbffff484: 'A' <repeats 30 times>
(gdb) x/x &auth_flag
0xbffff494: 0x41
into expected answer as below:
(gdb) x/s password_buffer
0xbffff494: 'A' <repeats 30 times>
(gdb) x/x &auth_flag
0xbffff484: 0x00
We simply add the -fstack-protector-all option during compilation and the result will be as expected. To get the opposite layout back, you can perhaps use -O0 or -fno-stack-protector.
Answer from: https://stackoverflow.com/a/21215205/3205268
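If you are reading in more than 15 bytes you will get that. strcpy keeps copying until it finds the end of the source string. You could use something like strncpy to copy only a limited number of characters. Note that strncpy does not add a terminating null byte when it truncates, so you have to terminate the buffer yourself.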
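As an alternative to truncating with strncpy, a length check before the copy avoids the overrun altogether. The following is only a sketch, and copy_if_fits is a hypothetical helper name that does not appear in the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, including the
 * terminating '\0'. Returns 0 on success, or -1 if src is too long, in
 * which case dst is set to the empty string. */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';          /* leave a valid, empty string behind */
        return -1;
    }
    memcpy(dst, src, len + 1);  /* len characters plus the terminator */
    return 0;
}
```

In check_authentication, a -1 result could simply be treated as a failed authentication.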
Related
I know it might be an obvious question, but I've decided to learn a little bit of low-level programming. I began with C and gdb.
First problem:
(gdb) x/10xb $rip
0x4005a5 <main+4>: 0xb9 0x04 0x00 0x00 0x00 0xba 0x03 0x00
0x4005ad <main+12>: 0x00 0x00
(gdb) x/10xh $rip
0x4005a5 <main+4>: 0x04b9 0x0000 0xba00 0x0003 0x0000 0x02be 0x0000 0xbf00
0x4005b5 <main+20>: 0x0001 0x0000
(gdb) x/10xw $rip
0x4005a5 <main+4>: 0x000004b9 0x0003ba00 0x02be0000 0xbf000000
0x4005b5 <main+20>: 0x00000001 0xffff9fe8 0x0000b8ff 0xc35d0000
Question: Why is the next address 0x4005ad when I use unit size b, but 0x4005b5 when I use h or w?
Second problem:
(gdb) x/4xw $rip + 0
0x4005a5 <main+4>: 0x000004b9 0x0003ba00 0x02be0000 0xbf000000
(gdb) x/4xw $rip + 1
0x4005a6 <main+5>: 0x00000004 0x000003ba 0x0002be00 0x01bf0000
(gdb) x/4xw $rip + 2
0x4005a7 <main+6>: 0xBA000000 0x00000003 0x000002be 0x0001bf00
(gdb) x/4xw $rip + 3
0x4005a8 <main+7>: 0x03BA0000 0xbe000000 0x00000002 0x000001bf
(gdb) x/4xw $rip + 4
0x4005a9 <main+8>: 0x0003BA00 0x02be0000 0xbf000000 0x00000001
(gdb) x/4xw $rip + 5
0x4005aa <main+9>: 0x000003BA 0x0002be00 0x01bf0000 0xe8000000
(gdb) x/4xw $rip + 6
0x4005ab <main+10>: 0x00000003 0x000002be 0x0001bf00 0x9fe80000
(gdb) x/4xw $rip + 7
0x4005ac <main+11>: 0xBE000000 0x00000002 0x000001bf 0xff9fe800
(gdb) x/4xw $rip + 8
0x4005ad <main+12>: 0x02BE0000 0xbf000000 0x00000001 0xffff9fe8
Question: Why is the same value (shown in capital letters) repeated but shifted to the right? For example, in the first column from $rip + 2 to $rip + 5, BA is first at the beginning, then in the middle, and finally at the end.
When you ask any low-level debugger to display values from memory starting at a given address, it will get some number of bytes from successive locations and display them. (Each address refers to a particular byte in memory.)
In your first problem you're asking it to display ten bytes. It displays each byte as a two-digit hexadecimal value, eight bytes per line, so the address of the second line is 0x4005a5 + 8, or 0x4005ad.
Then you ask it to display ten half words, eight half words per line. Because each half word is two bytes, the address of the second line is 0x4005a5 + 16, or 0x4005b5. With words, gdb prints four words of four bytes each per line, which is again 16 bytes.
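The address arithmetic can be written down as a small C sketch. The function name next_line_address is made up for illustration, and the units-per-line values are simply the ones visible in the gdb output above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: address at which the next output line of an
 * x-style memory dump starts, given the unit size in bytes and the
 * number of units printed per line. */
uint64_t next_line_address(uint64_t start, unsigned unit_size,
                           unsigned units_per_line)
{
    assert(unit_size > 0 && units_per_line > 0);
    return start + (uint64_t)unit_size * units_per_line;
}
```

For example, next_line_address(0x4005a5, 1, 8) corresponds to the byte dump, and next_line_address(0x4005a5, 2, 8) to the half-word dump.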
Your second problem is a little more complicated. Remember that when you ask it to display the contents of memory starting at a location it just fetches your four words STARTING at that location. When you pick one higher address then you're mostly getting the same memory values, just shifted by one.
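So why do the values in the words seem to be shifting around in the wrong direction? That is because you're asking for words, and x86 CPUs store multi-byte values in a somewhat unintuitive order, from least significant byte to most significant (little-endian). gdb prints each word with its most significant byte on the left, so the byte at the lowest address shows up at the right-hand end of the word.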
This should help:
https://en.wikipedia.org/wiki/Endianness
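The shifting pattern can be reproduced with a short C sketch that assembles little-endian words by hand. The array holds the ten bytes from the x/10xb output in the question, and read_le32 is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The ten bytes starting at main+4, copied from the x/10xb output. */
static const uint8_t code_bytes[10] = {
    0xb9, 0x04, 0x00, 0x00, 0x00, 0xba, 0x03, 0x00, 0x00, 0x00
};

/* Hypothetical helper: assemble the 32-bit little-endian word that
 * starts at bytes[offset]. The byte at the lowest address becomes the
 * least significant byte of the word. */
uint32_t read_le32(const uint8_t *bytes, size_t offset)
{
    assert(bytes != NULL);
    return (uint32_t)bytes[offset]
         | (uint32_t)bytes[offset + 1] << 8
         | (uint32_t)bytes[offset + 2] << 16
         | (uint32_t)bytes[offset + 3] << 24;
}
```

Reading at offsets 2, 3, 4 and 5 gives 0xba000000, 0x03ba0000, 0x0003ba00 and 0x000003ba, which matches the first column of the x/4xw output: the 0xba byte sits at a fixed address, so each time the start address moves up by one it occupies a less significant position in the word.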
I would like to create a backtrace in gdb (in a script). The command bt 2 prints only the 2 innermost frames, while bt -2 prints only the 2 outermost frames.
What I'd like to do is to skip the 2 innermost frames, and show all outer frames. I've tried
up 2
bt
(and similarly up-silently, frame, select-frame), but it doesn't affect the output of bt. To be clear, I want to get rid of the first two lines in this output:
#0 0x0000003167e0f33e in waitpid () from /lib64/libpthread.so.0
#1 0x00007f2779835de8 in print_trace() () at /path/to/MyAnalysis.cxx:385
#2 0x00007f2779836ec9 in MyAnalysis::getHistHolder(std::basic_string<char, std::char_traits<char>, std::allocator<char> >) () at /path/to/MyAnalysis.cxx:409
#3 0x00007f27798374aa in MyAnalysis::execute() () at /path/to/MyAnalysis.cxx:599
#4 0x00007f2783a9670f in EL::Worker::algsExecute() () from /blah/lib/libEventLoop.so
...
Any way to do this?
Calling return twice seems to work, but then the application is left in an invalid state afterwards, so I can't use it.
Your argument to "bt" depends on current number of frames present. Probably this can also be done in gdb directly (not sure), but this python script does exactly this:
import gdb

class TopBt(gdb.Command):
    """tbt N: show the backtrace without the N innermost frames."""

    def __init__(self):
        super(TopBt, self).__init__("tbt", gdb.COMMAND_DATA)

    @staticmethod
    def framecount():
        # Walk from the innermost frame outwards and count the frames.
        n = 0
        f = gdb.newest_frame()
        while f:
            n = n + 1
            f = f.older()
        return n

    def invoke(self, arg, from_tty):
        top = int(arg)  # parse the whole argument, not only its first character
        # "bt -K" prints the K outermost frames.
        btarg = -(TopBt.framecount() - top)
        if btarg < 0:
            gdb.execute("bt " + str(btarg))

TopBt()
Save this to some file (tbt.py) and source it in gdb (source tbt.py). Now you have a new command, tbt. tbt N will print the backtrace for all but the top (innermost) N frames.
If it's ok for the stack to be capped at some pre-determined length, you can provide an explicit long list, like this for up to 40 frames starting at frame 4:
frame apply level 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 -q frame
Frame numbers beyond what's available appear to be ignored.
I am trying to receive some data from the network using UDP and parse it.
Here is the code,
char recvline[1024];
int n=recvfrom(sockfd,recvline,1024,0,NULL,NULL);
for(int i=0;i<n;i++)
cout << hex <<static_cast<short int>(recvline[i])<<" ";
The printed output is:
19 ffb0 0 0 ff88 d 38 19 48 38 0 0 2 1 3 1 ff8f ff82 5 40 20 16 6 6 22 36 6 2c 0 0 0 0 0 0 0 0
But I am expecting output like:
19 b0 0 0 88 d 38 19 48 38 0 0 2 1 3 1 8f 82 5 40 20 16 6 6 22 36 6 2c 0 0 0 0 0 0 0 0
The ff shouldn't be there on printed output.
Actually I have to parse this data based on each character,
Like,
parseCommand(recvline);
and the parse code looks,
void parseCommand( char *msg){
int commId=*(msg+1);
switch(commId){
case 0xb0 : //do some operation
break;
case 0x20 : //do another operation
break;
}
}
And while debugging I am getting commId=-80 on watch.
Note:
On Linux I get the expected output with this code. Note that there I used unsigned char instead of char for the read buffer:
unsigned char recvline[1024];
int n=recvfrom(sockfd,recvline,1024,0,NULL,NULL);
On Windows, however, recvfrom() does not accept an unsigned buffer as its second argument and gives a build error, so I chose char.
It looks like you are getting the correct values, but your cast to short int during printing sign-extends your char value, causing ff to be propagated to the top byte if the top bit of your char is 1 (i.e. it is negative). You should first cast it to an unsigned type and then extend it to int, so you need 2 casts:
cout << hex << static_cast<short int>(static_cast<uint8_t>(recvline[i]))<<" ";
I have tested this and it behaves as expected.
In response to your extension: the data read is fine, it is a matter of how you interpret it. To parse correctly you should:
uint8_t commId= static_cast<uint8_t>(*(msg+1));
switch(commId){
case 0xb0 : //do some operation
break;
case 0x20 : //do another operation
break;
}
Because you store your data in a signed data type, conversions and promotions to bigger data types will first sign-extend the value (filling the high-order bits with the value of the MSB), even if it then gets converted to an unsigned data type.
One solution is to define recvline as uint8_t[] in the first place and cast it to char* when passing it to the recvfrom function. That way you only have to cast it once, and you are using the same code in your Windows and Linux versions. Also, uint8_t[] is (at least to me) a clear indication that you are using the array as raw memory instead of a string of some kind.
Another possibility is to simply perform a bitwise And: (recvline[i] & 0xff). Thanks to automatic integral promotion this doesn't even require a cast.
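Both variants, the cast and the mask, can be compared side by side. The sketch below is written in C (it also compiles as C++), and command_id and command_id_masked are illustrative names rather than part of the original parser. It assumes the command byte is at msg[1], as in parseCommand.

```c
#include <assert.h>

/* Hypothetical helper: command byte at msg[1], widened to int through
 * unsigned char so that no sign extension can happen. */
int command_id(const char *msg)
{
    assert(msg != 0);
    return (unsigned char)msg[1];
}

/* Same result using a mask: msg[1] is promoted to int first (possibly
 * sign-extended), then the upper bits are cleared again. */
int command_id_masked(const char *msg)
{
    assert(msg != 0);
    return msg[1] & 0xff;
}
```

With either helper the value can be used directly in switch(commId) with case 0xb0, whether plain char is signed or unsigned on the platform.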
Personal Note:
It is really annoying that the C and C++ standards don't provide a separate type for raw memory (yet), but with any luck we'll get a byte type in a future standard revision.
The following code crashes on a 64-bit system. If the file name length is less
than 3, then an underflow happens when computing 'len - 3'. I get a segmentation
fault on a 64-bit system, but the same program does not show any segmentation
fault on a 32-bit system. Why does this program not show a segmentation fault
on a 32-bit system?
DIR * dirp = opendir(dirPath);
struct dirent * dp;
while(dirp)
{
if((dp = readdir(dirp)) != NULL)
{
unsigned int len = strlen(dp->d_name);
//underflow happens if filename length less than 3
if((dp->d_name[len - 3] == 'j'))
}
}
Your program results in undefined behaviour, as you appear to be aware. You are attempting to access memory outside the bounds of the array. And undefined behaviour is just what it sounds like: the behaviour is not defined. Anything could happen.
You might get a segmentation fault one time you run, and not another time. Or you might see different behaviour under different compilers. Undefined behaviour is by its very nature unpredictable. The fact that you seemed to get away with this error in your code under one compiler does not make your code correct.
Obviously what you should do is to avoid writing programs that result in undefined behaviour.
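One way to avoid the out-of-bounds access is to check the length before indexing. This is a minimal sketch, and third_from_end_is is a hypothetical helper name that is not part of the original code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if name has at least three characters
 * and the third character from the end equals ch, otherwise 0. */
int third_from_end_is(const char *name, char ch)
{
    assert(name != NULL);
    size_t len = strlen(name);  /* size_t matches strlen's return type */
    if (len < 3)
        return 0;               /* too short: never index before the array */
    return name[len - 3] == ch;
}
```

In the readdir loop the test would then read third_from_end_is(dp->d_name, 'j'), which is well defined for entries such as "." and "..".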
Why does this program not show a segmentation fault on a 32-bit system?
Look, this is a slightly simplified version of your program:
1 int main(int argc, char *argv[])
2 {
3 char name[100];
4 unsigned int len = 3;
5 name[len-argc] = 1;
6 return 0;
7 }
So when I build it as 32-bit program gcc -m32 -g main.c -o main32 this is how under gdb the address space of a process looks:
$ gdb -q --args ./main32 1 2 3
Reading symbols from /home/main32...done.
(gdb) start
(gdb) info proc mappings
process 28330
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x110000 0x111000 0x1000 0x0 [vdso]
0x3fa000 0x418000 0x1e000 0x0 /lib/ld-2.12.so
0x418000 0x419000 0x1000 0x1d000 /lib/ld-2.12.so
0x419000 0x41a000 0x1000 0x1e000 /lib/ld-2.12.so
0x41c000 0x5a8000 0x18c000 0x0 /lib/libc-2.12.so
0x5a8000 0x5aa000 0x2000 0x18c000 /lib/libc-2.12.so
0x5aa000 0x5ab000 0x1000 0x18e000 /lib/libc-2.12.so
0x5ab000 0x5ae000 0x3000 0x0
0x8048000 0x8049000 0x1000 0x0 /home/main32
0x8049000 0x804a000 0x1000 0x0 /home/main32
0xf7fdf000 0xf7fe0000 0x1000 0x0
0xf7ffd000 0xf7ffe000 0x1000 0x0
0xfffe9000 0xffffe000 0x15000 0x0 [stack]
(gdb) p/x &(name[len-argc])
$2 = 0xffffcfab
As you can see name[3-4] (it is underflow as you say) actually points to a valid address on stack. This is why your process does not crash.
When I build the same program as 64 bit (gcc -m64 -g main.c -o main64) the address will not be valid
(gdb) info proc mappings
process 29253
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x400000 0x401000 0x1000 0x0 /home/main64
0x600000 0x601000 0x1000 0x0 /home/main64
0x3c40a00000 0x3c40a20000 0x20000 0x0 /lib64/ld-2.12.so
0x3c40c1f000 0x3c40c20000 0x1000 0x1f000 /lib64/ld-2.12.so
0x3c40c20000 0x3c40c21000 0x1000 0x20000 /lib64/ld-2.12.so
0x3c40c21000 0x3c40c22000 0x1000 0x0
0x3c41200000 0x3c41389000 0x189000 0x0 /lib64/libc-2.12.so
0x3c41389000 0x3c41588000 0x1ff000 0x189000 /lib64/libc-2.12.so
0x3c41588000 0x3c4158c000 0x4000 0x188000 /lib64/libc-2.12.so
0x3c4158c000 0x3c4158d000 0x1000 0x18c000 /lib64/libc-2.12.so
0x3c4158d000 0x3c41592000 0x5000 0x0
0x7ffff7fdd000 0x7ffff7fe0000 0x3000 0x0
0x7ffff7ffd000 0x7ffff7ffe000 0x1000 0x0
0x7ffff7ffe000 0x7ffff7fff000 0x1000 0x0 [vdso]
0x7ffffffea000 0x7ffffffff000 0x15000 0x0 [stack]
0xffffffffff600000 0xffffffffff601000 0x1000 0x0 [vsyscall]
(gdb) p/x &name[len-argc]
$5 = 0x8000ffffde3f
One more thing. This is how assembler looks for 64-bit application:
(gdb) disassemble /m
Dump of assembler code for function main:
5 name[len-argc] = 1;
0x0000000000400472 <+22>: mov -0x74(%rbp),%edx
0x0000000000400475 <+25>: mov -0x4(%rbp),%eax
0x0000000000400478 <+28>: sub %edx,%eax
0x000000000040047a <+30>: mov %eax,%eax
=> 0x000000000040047c <+32>: movb $0x1,-0x70(%rbp,%rax,1)
This is $eax:
(gdb) p $eax
$1 = -1
But the store instruction uses rax, since you are in 64-bit mode. The mov %eax,%eax instruction zero-extends eax into rax, so this is the value of $rax:
(gdb) p/x $rax
$3 = 0xffffffff
So the program adds a huge positive offset to a valid stack address, and that results in an invalid address.
I would like to underline that this is undefined behavior in both 32-bit and 64-bit modes. If you want to fix this undefined behavior you can read another answer of mine: https://stackoverflow.com/a/24287919/184968.
In dp->d_name[len - 3] == 'j', the address computed from len - 3 might fall within a mapped segment on this 32-bit machine and just outside one on the 64-bit machine. It depends on your operating system.
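The difference between the two builds can be reproduced with plain integer arithmetic. In the sketch below the function names are made up, and the base addresses used in the note afterwards are back-computed from the gdb output above (0xffffcfab + 1 and 0x8000ffffde3f - 0xffffffff), so treat them as illustrative values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit process: pointers are 32 bits wide,
 * so base + index wraps around modulo 2^32. */
uint32_t element_address32(uint32_t base, unsigned int index)
{
    assert(sizeof(unsigned int) == 4);  /* sketch assumes 32-bit unsigned int */
    return base + (uint32_t)index;
}

/* Hypothetical model of a 64-bit process: the unsigned int index is
 * zero-extended to 64 bits, not sign-extended, before it is added. */
uint64_t element_address64(uint64_t base, unsigned int index)
{
    assert(sizeof(unsigned int) == 4);  /* sketch assumes 32-bit unsigned int */
    return base + (uint64_t)index;
}
```

With index 3u - 4u (which is 0xffffffff), element_address32(0xffffcfac, index) yields 0xffffcfab, one byte below the array and still inside the stack, while element_address64(0x7fffffffde40, index) yields 0x8000ffffde3f, far outside any mapping.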
We are encountering a randomly occurring segmentation fault in a C/C++ HPUX PA-RISC application (a RELEASE demo compiled and linked with the HPUX PA-RISC compiler and linker ACC) which loads an HPUX PA-RISC RELEASE shared object (sl, i.e. so) that is also compiled and linked with ACC. We do not have access to pmap or HPUX wdb, so we are using HP's proprietary debugger adb. Here is how we use adb:
$ adb
PA-32 adb ($h help $q quiit)
adb>!cp mdMUReadWriteExample a.out
!
adb>:r
a.out: running (process 10947)
segmentation violation
stopped at 1E3C: STW r3,1416(r1)
At this point it appears that the fault is somehow related to the above assembly instruction. Our first question is whether 1416 is in decimal or hexadecimal format.
Our second question is whether the program counter 1E3C is accurate and can be used to gain further information about the offending C/C++ source line of code.
Our third question assumes that 1416 is in decimal format. As shown below, register 1 ($r1) contains 0x40015b90. 1416 in base 10 is 0x588, and 0x588 + 0x40015b90 equals 0x40016118. Next, we use nm to find the shared object library address / C++ mangled symbol associated with 0x40016118.
$ grep -n "4001611" /home/marc/acc3_pa_32bit/cameron_nm.txt
27808:40016118 ? static___soa_RSA_cpp_
27823:40016110 ? static___soa_cDateTime_cpp_
Next we modify our makefile to obtain the combined disassembly and C++ source code. However, when we search all 50 generated *.s files, mysteriously we cannot find static___soa_RSA_cpp_. Have we skipped a crucial step here?
adb>$r
pcoqh 0 1E3F
pcoqt 0 1E43
rp 0 0xC0209793
arg0 0 1 arg1 0 7F7F04FC arg2 0 7F7F0504 arg3 0 7F7F0540
sp 0 7F7F05D0 ret0 0 0 ret1 0 1 dp 0 40016390
r1 0 40015B90 r3 0 7F7F0000 r4 0 40015918 r5 0 3C
r6 0 20 r7 0 3E r8 0 7F7F0910 r9 0 40015918
r10 0 40031918 r11 0 1E800 r12 0 40016118 r13 0 400266A4
r14 0 3F r15 0 3F r16 0 3D r17 0 3D
r18 0 3A r19 0 7B03B764 r20 0 0xA98D400 r21 0 7F7F0550
r22 0 0 r31 0 1E2B sar 0 23 sr0 0 0xA98D400
sr1 0 3848400 sr2 0 0 sr3 0 0 sr4 0 0xA98D400
In summary, we are trying to determine if it is possible to find the offending C/C++ source lines which cause this random seg fault. Using Centos Linux and valgrind --tool=memcheck we cannot find any buffer overruns. Thank you.
Good evening, I figured out how to obtain a segmentation fault stack trace on HPUX PA-RISC. Four steps are required:
1) #include "unwind.h" and #include "signal.h".
2) Define an extern "C" U_STACK_TRACE(int) function prototype.
3) In the main function, install a SIGSEGV handler: signal(SIGSEGV,U_STACK_TRACE).
4) In the makefile, link to libcl.
Regards, Frank Tzepu Chang
$ mdMUReadWriteExample
( 0) 0xc01fef60 _sigreturn [/usr/lib/libc.2]
( 1) 0xc2f27b90 _ct_7CBigNumFv_2 + 0x88 [./libmdMatchup.sl]
( 2) 0xc2f3c83c RSADecrypt_FPCcN21Pc + 0x24 [./libmdMatchup.sl]
( 3) 0xc2f314ec DecryptLicense_9mdLicenseFPCcPc + 0x44 [./libmdMatchup.sl]
( 4) 0xc2f31280 DecryptDecodeTest_9mdLicenseFPCcT1 + 0x40 [./libmdMatchup.sl]
( 5) 0xc2f30c3c TestLicense_9mdLicenseFPCc + 0xb4 [./libmdMatchup.sl]
( 6) 0xc2d783bc SetLicenseString_12cBatchDedupeFPCc + 0x5c [./libmdMatchup.sl]
( 7) 0xc2d6c908 SetLicenseString_13mdMUReadWriteFPCc + 0x90 [./libmdMatchup.sl]
( 8) 0x0000376c main + 0x68 [./mdMUReadWriteExample]
( 9) 0xc01409f8 _start + 0xa0 [/usr/lib/libc.2]
(10) 0x00002008 $START$ + 0x178 [./mdMUReadWriteExample]
Segmentation fault (core dumped)