I know it might be an obvious question, but I've decided to learn a little bit of low-level programming. I began with C and gdb.
First problem:
`(gdb) x/10xb $rip
0x4005a5 <main+4>: 0xb9 0x04 0x00 0x00 0x00 0xba 0x03 0x00
0x4005ad <main+12>: 0x00 0x00
(gdb) x/10xh $rip
0x4005a5 <main+4>: 0x04b9 0x0000 0xba00 0x0003 0x0000 0x02be 0x0000 0xbf00
0x4005b5 <main+20>: 0x0001 0x0000
(gdb) x/10xw $rip
0x4005a5 <main+4>: 0x000004b9 0x0003ba00 0x02be0000 0xbf000000
0x4005b5 <main+20>: 0x00000001 0xffff9fe8 0x0000b8ff 0xc35d0000
`
Question: Why is the next address 0x4005ad when I use unit size b, but 0x4005b5 when I use h or w?
Second problem:
`(gdb) x/4xw $rip + 0
0x4005a5 <main+4>: 0x000004b9 0x0003ba00 0x02be0000 0xbf000000
(gdb) x/4xw $rip + 1
0x4005a6 <main+5>: 0x00000004 0x000003ba 0x0002be00 0x01bf0000
(gdb) x/4xw $rip + 2
0x4005a7 <main+6>: 0xBA000000 0x00000003 0x000002be 0x0001bf00
(gdb) x/4xw $rip + 3
0x4005a8 <main+7>: 0x03BA0000 0xbe000000 0x00000002 0x000001bf
(gdb) x/4xw $rip + 4
0x4005a9 <main+8>: 0x0003BA00 0x02be0000 0xbf000000 0x00000001
(gdb) x/4xw $rip + 5
0x4005aa <main+9>: 0x000003BA 0x0002be00 0x01bf0000 0xe8000000
(gdb) x/4xw $rip + 6
0x4005ab <main+10>: 0x00000003 0x000002be 0x0001bf00 0x9fe80000
(gdb) x/4xw $rip + 7
0x4005ac <main+11>: 0xBE000000 0x00000002 0x000001bf 0xff9fe800
(gdb) x/4xw $rip + 8
0x4005ad <main+12>: 0x02BE0000 0xbf000000 0x00000001 0xffff9fe8`
Question: Why is the same value (shown in capital letters) repeated in the first column but shifted to the right? For example, from $rip + 2 to $rip + 5, BA appears first at the beginning of the word, then in the middle, and finally at the end.
When you ask any low-level debugger to display values from memory starting at a given address, it fetches some number of bytes from successive locations and displays them. (Each address refers to one particular byte in memory.)
In your first problem you're asking it to display ten bytes. It displays each byte as a two-digit hexadecimal value, eight bytes per line, so the second line starts at 0x4005a5 + 8, or 0x4005ad.
Then you ask it to display ten half words, eight half words per line. Because each half word is two bytes, the second line starts at 0x4005a5 + 16, or 0x4005b5. With w, gdb prints four 4-byte words per line, which is again 16 bytes, so the second line also starts at 0x4005b5.
Your second problem is a little more complicated. Remember that when you ask it to display the contents of memory starting at a location, it just fetches your four words STARTING at that location. When you pick an address one higher, you're mostly getting the same memory values, just shifted by one byte.
So why do the values in the words seem to be shifting around in the wrong direction? That has to do with the fact that you're asking for words, and x86 CPUs store words in a somewhat unintuitive order, from least significant byte to most. gdb prints each word with its most significant byte on the left, so a byte that ends up one position earlier within the word moves one position to the right in the printed value.
This should help:
https://en.wikipedia.org/wiki/Endianness
What is the address of _PEB_LDR_DATA from the start of PEB?
Somewhere it says: one of those structures is a pointer to _PEB_LDR_DATA at offset 0x0c from the start of the PEB.
And somewhere else it says: PVOID64 LDR_DATA_Addr = *(PVOID64**)((BYTE*)Peb+0x018); //0x018 is the LDR relative to the PEB offset. The base address of the LDR is stored.
So I am confused.
And what does this code mean?
dataTableEntry = *(_LDR_DATA_TABLE_ENTRY **)(*(int *)(*(int *)(in_FS_OFFSET + 0x30) + 0xc) + 0xc);
baseDllNamePtr2 = &dataTableEntry->BaseDllName;
dataTableEntry = (_LDR_DATA_TABLE_ENTRY *)(dataTableEntry->InLoadOrderLinks).Flink;
If you look at the PEB structure, you'll see the following:
typedef struct _PEB {
BYTE Reserved1[2];
BYTE BeingDebugged;
BYTE Reserved2[1];
PVOID Reserved3[2];
PPEB_LDR_DATA Ldr;
// ...
} PEB, *PPEB;
The first 3 entries (Reserved1, BeingDebugged, and Reserved2) take up 4 bytes on x86 and x64. After that, the offset calculation changes between 32-bit and 64-bit code.
In 32-bit code, pointers are 4-byte aligned. Thus, there is no padding between Reserved2 and Reserved3. With Reserved3's size being 8 bytes (2 4-byte pointers), the offset of Ldr evaluates to 2 * 1 + 1 + 1 * 1 + 0 + 2 * 4 (i.e. 12 or 0x0C).
In 64-bit code there are 2 differences: pointers are 8 bytes in size, and 8-byte aligned. The latter introduces padding between Reserved2 and Reserved3. The offset of Ldr thus evaluates to 2 * 1 + 1 + 1 * 1 + 4 + 2 * 8 (i.e. 24 or 0x18).
The following table summarizes the offsets for x86 and x64:
| Field         | Offset (x86) | Offset (x64) |
|---------------|--------------|--------------|
| Reserved1     | 0            | 0            |
| BeingDebugged | 2            | 2            |
| Reserved2     | 3            | 3            |
| Reserved3     | 4            | 8 (padding)  |
| Ldr           | 12           | 24           |
This was given as a past question in an exam, but I'm unable to understand the output of the last 4 printf calls. I understand the conversion to hexadecimal for the first 2, but I don't see where the values at ptr[0] to ptr[3] come from.
This is the section of code that was compiled and run.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){
    typedef unsigned char byte;
    unsigned int nines = 999;
    byte * ptr = (byte *) &nines;   /* view the int as individual bytes */
    printf ("%x\n",nines);
    printf ("%x\n",nines * 0x10);
    printf ("%d\n",ptr[0]);
    printf ("%d\n",ptr[1]);
    printf ("%d\n",ptr[2]);
    printf ("%d\n",ptr[3]);
    return EXIT_SUCCESS;
}
and this was the corresponding output
3e7
3e70
231
3
0
0
When you do byte * ptr = (byte *) &nines; you make ptr point to the address of nines. nines has the value 999, which is 0x3e7 in hex.
From the problem, I am assuming that an int has 4 bytes and that this is a little-endian system, i.e. the bytes are stored like this:
---------------------------------
| 0xe7 | 0x03 | 0x00 | 0x00 |
---------------------------------
ptr ptr+1 ptr+2 ptr+3
So when you print them out, you get the values 231, 3, 0 and 0 (231 is equal to 0xe7).
In a little-endian system, used by Intel processors and most microcontrollers today, the least significant byte is stored first and the most significant byte is stored last.
On the other hand, there is the big-endian system, used by some older Motorola controllers and PowerPCs. There the most significant byte is stored first, and the output on those systems would be 0, 0, 3 and 231.
This code is platform-dependent.
Given that your platform is:
Little Endian
CHAR_BIT == 8
sizeof(int) == 4
The binary representation of 999 in memory is 11100111 00000011 00000000 00000000.
Hence the decimal representation of 999 in memory is 231 3 0 0.
As a side-note, you should bring it to the attention of your instructor at school/college/university, that since this code is platform-dependent, it is a very bad example to be given as part of an exam.
If you have an exam like this, I suggest you change lecturers as soon as possible.
The representation of unsigned int is implementation-defined: its size and endianness depend on your machine.
Reading the bytes of an unsigned int through an unsigned char * is permitted in C, so the cast itself is fine; what is not portable is the set of values you get back.
On a little-endian machine such as x86, your unsigned int of 999 is represented as:
| 0xE7 | 0x03 | 0x00 | 0x00 |
-----------------------------
ptr ptr+1 ptr+2 ptr+3
where the number between the | characters is the value of that byte. Hence, it will be printed as:
231 3 0 0
On another machine, say a 32-bit big-endian one (e.g. Atmel AVR32), it will be represented as:
| 0x00 | 0x00 | 0x03 | 0xE7 |
-----------------------------
ptr ptr+1 ptr+2 ptr+3
then it will print:
0 0 3 231
On yet another machine, say a 32-bit middle-endian one, it will be represented as:
| 0x03 | 0xE7 | 0x00 | 0x00 |
-----------------------------
ptr ptr+1 ptr+2 ptr+3
then it will print:
3 231 0 0
On an older machine, say a 16-bit little-endian one, it is represented as:
| 0xE7 | 0x03 | xx | xx |
------------------------
ptr ptr+1 ptr+2 ptr+3
where xx is an unspecified value; reading those two bytes is undefined behavior, because they lie outside the 2-byte object.
On a 64-bit big-endian machine (with a 64-bit unsigned int), it is represented as:
| 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x03 | 0xE7 |
-----------------------------
ptr ptr+1 ptr+2 ptr+3
it will print:
0 0 0 0
That said, there is no single correct answer to the exam question: the output depends on the platform's int size and byte order.
Further reading about Endianness, undefined behavior
This code displays the values of each individual byte of the (assumed to be 32-bit) number nines.
nines's value is 999 in decimal, 3E7 in hexadecimal, and according to the values printed, it's stored in little-endian byte order (the "least significant" byte comes first).
It's easier to see if you convert the values to hexadecimal as well:
printf ("%x\n",ptr[0]);
printf ("%x\n",ptr[1]);
printf ("%x\n",ptr[2]);
printf ("%x\n",ptr[3]);
Which will display this:
e7
3
0
0
Also, you could interpret the decimal values this way:
231 + 3*256 + 0*65536 + 0*16777216 = 999
nines is an unsigned 32-bit integer on the stack (note that it is possible for int to be 64 bits wide, but that does not seem to be the case here).
ptr is a pointer, which gets initialized to the address of nines. Since it is a pointer, you can use array syntax to access the values at the address it points to. We assume it is a little-endian machine, so ptr[0] is the first (least significant) byte of nines, ptr[1] is the next, etc.
231 is then the value of the least significant byte; in hex it is 0xe7.
The following code crashes on a 64-bit system. If the file name length is less than 3, then len - 3 underflows. On a 32-bit system the program does not show any segmentation fault, but on a 64-bit system I get one. Why does this program not show a segmentation fault on the 32-bit system?
DIR * dirp = opendir(dirPath);
struct dirent * dp;
while(dirp)
{
    if((dp = readdir(dirp)) != NULL)
    {
        unsigned int len = strlen(dp->d_name);
        //underflow happens if filename length less than 3
        if((dp->d_name[len - 3] == 'j'))
    }
}
Your program results in undefined behaviour, as you appear to be aware. You are attempting to access memory outside the bounds of the array. And undefined behaviour is just what it sounds like: the behaviour is not defined. Anything could happen.
You might get a segmentation fault one time you run, and not another time. Or you might see different behaviour under different compilers. Undefined behaviour is by its very nature unpredictable. The fact that you seemed to get away with this error in your code under one compiler does not make your code correct.
Obviously, what you should do is avoid writing programs that result in undefined behaviour.
Why does this program not show a segmentation fault on the 32-bit system?
Here is a slightly simplified version of your program:
1 int main(int argc, char *argv[])
2 {
3 char name[100];
4 unsigned int len = 3;
5 name[len-argc] = 1;
6 return 0;
7 }
When I build it as a 32-bit program (gcc -m32 -g main.c -o main32), this is how the address space of the process looks under gdb:
$ gdb -q --args ./main32 1 2 3
Reading symbols from /home/main32...done.
(gdb) start
(gdb) info proc mappings
process 28330
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x110000 0x111000 0x1000 0x0 [vdso]
0x3fa000 0x418000 0x1e000 0x0 /lib/ld-2.12.so
0x418000 0x419000 0x1000 0x1d000 /lib/ld-2.12.so
0x419000 0x41a000 0x1000 0x1e000 /lib/ld-2.12.so
0x41c000 0x5a8000 0x18c000 0x0 /lib/libc-2.12.so
0x5a8000 0x5aa000 0x2000 0x18c000 /lib/libc-2.12.so
0x5aa000 0x5ab000 0x1000 0x18e000 /lib/libc-2.12.so
0x5ab000 0x5ae000 0x3000 0x0
0x8048000 0x8049000 0x1000 0x0 /home/main32
0x8049000 0x804a000 0x1000 0x0 /home/main32
0xf7fdf000 0xf7fe0000 0x1000 0x0
0xf7ffd000 0xf7ffe000 0x1000 0x0
0xfffe9000 0xffffe000 0x15000 0x0 [stack]
(gdb) p/x &(name[len-argc])
$2 = 0xffffcfab
As you can see, name[3-4] (the underflow you mention) actually points to a valid address on the stack. This is why your process does not crash.
When I build the same program as 64-bit (gcc -m64 -g main.c -o main64), the address is not valid:
(gdb) info proc mappings
process 29253
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x400000 0x401000 0x1000 0x0 /home/main64
0x600000 0x601000 0x1000 0x0 /home/main64
0x3c40a00000 0x3c40a20000 0x20000 0x0 /lib64/ld-2.12.so
0x3c40c1f000 0x3c40c20000 0x1000 0x1f000 /lib64/ld-2.12.so
0x3c40c20000 0x3c40c21000 0x1000 0x20000 /lib64/ld-2.12.so
0x3c40c21000 0x3c40c22000 0x1000 0x0
0x3c41200000 0x3c41389000 0x189000 0x0 /lib64/libc-2.12.so
0x3c41389000 0x3c41588000 0x1ff000 0x189000 /lib64/libc-2.12.so
0x3c41588000 0x3c4158c000 0x4000 0x188000 /lib64/libc-2.12.so
0x3c4158c000 0x3c4158d000 0x1000 0x18c000 /lib64/libc-2.12.so
0x3c4158d000 0x3c41592000 0x5000 0x0
0x7ffff7fdd000 0x7ffff7fe0000 0x3000 0x0
0x7ffff7ffd000 0x7ffff7ffe000 0x1000 0x0
0x7ffff7ffe000 0x7ffff7fff000 0x1000 0x0 [vdso]
0x7ffffffea000 0x7ffffffff000 0x15000 0x0 [stack]
0xffffffffff600000 0xffffffffff601000 0x1000 0x0 [vsyscall]
(gdb) p/x &name[len-argc]
$5 = 0x8000ffffde3f
One more thing. This is how the assembly looks for the 64-bit application:
(gdb) disassemble /m
Dump of assembler code for function main:
5 name[len-argc] = 1;
0x0000000000400472 <+22>: mov -0x74(%rbp),%edx
0x0000000000400475 <+25>: mov -0x4(%rbp),%eax
0x0000000000400478 <+28>: sub %edx,%eax
0x000000000040047a <+30>: mov %eax,%eax
=> 0x000000000040047c <+32>: movb $0x1,-0x70(%rbp,%rax,1)
This is $eax:
(gdb) p $eax
$1 = -1
But the store uses rax, since you are in 64-bit mode. And this is the value of $rax:
(gdb) p/x $rax
$3 = 0xffffffff
So the program adds a huge positive offset to a valid stack address, and the result is an invalid address.
I would like to underline that this is undefined behavior in both 32-bit and 64-bit modes. If you want to fix this undefined behavior, you can read another answer of mine: https://stackoverflow.com/a/24287919/184968.
In dp->d_name[len - 3] == 'j', the address computed from len - 3 might fall inside a mapped region on this 32-bit machine and just outside one on the 64-bit machine. It depends on your operating system and on how the process's memory is laid out.
I have the following program
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <string.h>
4
5 int check_authentication(char *password){
6 char password_buffer[16];
7 int auth_flag =0;
8
9
10 strcpy(password_buffer, password);
11
12 if(strcmp(password_buffer, "brillig" ) == 0 )
13 auth_flag = 1;
14 if(strcmp(password_buffer, "outgrabe") == 0)
15 auth_flag = 1;
16
17 return auth_flag;
18 }
19
20 int main(int argc, char *argv[]){
21 if (argc<2){
22 printf("Usage: %s <password>\n", argv[0]);
23 exit(0);
24 }
25
26 if(check_authentication(argv[1])){
27 printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=\n");
28 printf(" Access Granted.\n");
29 printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=\n");
30 }
31 else {
32 printf("\n Access Denied. \n");
33 }
34 }
I am running it under gdb with 30 bytes of As as the argument, and I am setting the following breakpoints:
(gdb) break 9
Breakpoint 1 at 0x80484c1: file auth_overflow2.c, line 9.
(gdb) break 16
Breakpoint 2 at 0x804850f: file auth_overflow2.c, line 16.
(gdb) run AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
So far so good. Everything goes as expected up to the first breakpoint:
Breakpoint 1, check_authentication (password=0xbffff6d2 'A' <repeats 30 times>)
at auth_overflow2.c:10
10 strcpy(password_buffer, password);
(gdb) x/s password_buffer
0xbffff484: "\364\237\374\267\240\205\004\b\250\364\377\277\245", <incomplete sequence \352\267>
(gdb) x/x &auth_flag
0xbffff494: 0x00
Now we see the following information:
Variable auth_flag is at address 0xbffff494 and variable password_buffer is at address 0xbffff484. Since the address of auth_flag is greater than the address of password_buffer, and the stack grows towards lower addresses, that means the additional bytes (the overrun of the buffer) WILL NOT OVERWRITE auth_flag. Right?
But gdb has a different opinion...
(gdb) cont
Continuing.
Breakpoint 2, check_authentication (
password=0xbf004141 <Address 0xbf004141 out of bounds>)
at auth_overflow2.c:17
17 return auth_flag;
(gdb) x/s password_buffer
0xbffff484: 'A' <repeats 30 times>
(gdb) x/x &auth_flag
0xbffff494: 0x41
and ...
(gdb) x/16xw &auth_flag
0xbffff494: 0x41414141 0x41414141 0x41414141 0xbf004141
0xbffff4a4: 0x00000000 0xbffff528 0xb7e8bbd6 0x00000002
0xbffff4b4: 0xbffff554 0xbffff560 0xb7fe1858 0xbffff510
0xbffff4c4: 0xffffffff 0xb7ffeff4 0x080482bc 0x00000001
We see that auth_flag was overwritten with 0x41 (= A), although this variable was in a lower position in the stack. Why did this happen?
Stack growth direction has nothing to do with where the extra bytes go when you overrun a buffer. Overruns from strcpy always go into higher addresses (unless you overrun so far that you wrap around to address 0, which is pretty unlikely).
Objects are stored in memory from lower addresses up to higher addresses. As you cannot guarantee that the length of the string referred to by the parameter password is less than 16, your code is invalid.
In fact, there is no need for the local buffer password_buffer.
The function could be written the following way:
_Bool check_authentication( const char *password )
{
return ( strcmp( password, "brillig" ) == 0 || strcmp( password, "outgrabe" ) == 0 );
}
Instead of the return type _Bool you may use int, as in your implementation of the function. In either case 1 or 0 will be returned.
The compiler is free to reorder local variables on the stack, so nothing guarantees which of the two comes first. In this case the char array was placed before the int variable, which makes the program vulnerable to a stack-based buffer overflow.
In order to change the following:
(gdb) x/s password_buffer
0xbffff484: 'A' <repeats 30 times>
(gdb) x/x &auth_flag
0xbffff494: 0x41
into the expected layout below:
(gdb) x/s password_buffer
0xbffff494: 'A' <repeats 30 times>
(gdb) x/x &auth_flag
0xbffff484: 0x00
simply add the -fstack-protector-all argument during compilation, and the result will be as expected. To get the opposite layout, you can try -O0 or -fno-stack-protector.
Answer from: https://stackoverflow.com/a/21215205/3205268
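If you read in more than 15 bytes, you will get that: strcpy keeps copying until it finds the end of the string. You could use something like strncpy to copy only a limited number of characters, but note that strncpy does not null-terminate the destination when the source is too long.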
x/4x xxx will check bytes at higher addresses than xxx.
How do I check bytes at lower addresses?
Just subtract the number of preceding bytes you want from xxx:
(gdb) x/4x 0x13b3da00
0x13b3da00: 0x004cc630 0x00000000 0x13af3ba0 0x00000000
(gdb) x/4x 0x13b3da00-4
0x13b3d9fc: 0x00000000 0x004cc630 0x00000000 0x13af3ba0
(gdb) x/4x 0x13b3da00-8
0x13b3d9f8: 0x00000000 0x00000000 0x004cc630 0x00000000