NASM Segmentation Fault ( strchrnul ) - gdb

I need help with some NASM code. I have to check whether intgr1 mod intgr2 == 0, but I can't use DIV.
I am getting a segmentation fault. From gdb I found:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7aacd2a in strchrnul () from /lib/x86_64-linux-gnu/libc.so.6
My program:
;nasm -f elf64 main.nasm
;gcc -o main main.o -lc
section .text
global main
extern scanf
extern printf
section .data
request1: db "Dividendo: ", 0
request2: db "Divisor: ", 0
message1: db "Eh divisivel", 0
message2: db "Nao eh divisivel", 0
formatin: db "%d", 0
intgr1: times 4 db 0 ; 32-bits integer = 4 bytes
intgr2: times 4 db 0 ;
main:
push request1 ;imprime pedido dividendo
call printf
add esp, 4
push intgr1 ;scanf do dividendo
push formatin
call scanf
add esp, 8
push request2 ;imprime pedido divisor
call printf
add esp, 4
push intgr2 ;scanf do divisor
push formatin
call scanf
add esp, 8
mov eax, [intgr1]
mov ebx, [intgr2]
jmp L1
L1: cmp eax, ebx ;compara dividendo divisor
jb L2 ;se < entao vai pra l2
sub eax,ebx ;dividendo:=dividendo-divisor
jmp L1 ;vai pra L1
L2: cmp eax, 0 ;compara dividendo e 0
je L3 ;se igual vai para l3
jmp L4 ;se nao vai para l4
L3: push message1 ;imprime que eh divisivel
call printf
add esp, 4
L4:push message2 ;imprime que nao eh
call printf
add esp, 4
MOV AL, 1 ;termina o programa
MOV EBX, 0
INT 80h
Does anyone have an idea of what is wrong?
Thanks.

nasm -f elf64 main.nasm
Are you assembling a 64-bit app? In 64-bit land we don't push parameters on the stack; we pass them in registers.
Calling conventions: look at the x86-64 row of the table; it tells you which registers Linux uses in its calling convention: RDI, RSI, RDX, RCX, R8, R9 for integer and pointer arguments, and XMM0–7 for floating-point arguments.
Your printf should be:
mov rdi, request1
xor rax, rax
call printf
Your printf call should also pass a format string instead of using the message itself as the format, or you can run into problems later. Learn the correct way now and have fewer problems later.
scanf takes its arguments in registers the same way:
mov rsi, intgr2
mov rdi, formatin
xor rax, rax
call scanf
Since you're linking with the C library, you need to call exit so the library can do its cleanup.
xor rdi, rdi
call exit

Related

I have an x86-64 program that only works properly when run from the gdb debugger

I have written a primitive version of malloc in x86 assembler as an exercise. The code uses a linked list to keep track of allocated memory blocks. I decided to add a function to walk the list and print out the metadata for each block, and encountered this weird problem.
When I run the code using gdb it works properly, but when run directly without gdb it does not. When I print out an address returned by sbrk as a hex string, it only prints correctly if run from gdb. If run repeatedly without gdb, it prints a different number on each run.
I have cut the code down to the minimum needed to illustrate the problem. I have tried everything I can think of to find the problem. I'm sure that my itoh and printstring functions are working correctly. I have tried linking with the C library and using puts, but it does the same. I tried initializing all registers to zero. I have looked for any registers altered by the call to sbrk and saved and restored them across the call. Nothing has worked. Here is the code that illustrates the problem:
global _start,itoh,printstring
section .rodata
TRUE equ 1
FALSE equ 0
NULL equ 0
LF equ 10
sys_brk equ 12
exit_ok equ 0
sys_exit equ 60
sys_write equ 1
stdout equ 1
section .data
current_brk dq 1
linefeed db LF, NULL
msg1 db 'Test should print 0x403000 from constant: ', NULL
msg2 db 'Test should print 0x403000 from sys_brk return: ', NULL
number db '--------------------', NULL
section .text
_start: mov rdi, msg1
call printstring
mov rdi, 0x403000
mov rsi, number
mov rdx, TRUE
call itoh
mov rdi, number
call printstring
mov rax, sys_brk
syscall
mov [current_brk], rax
mov rdi, msg2
call printstring
mov rdi, [current_brk]
mov rsi, number
mov rdx, TRUE
call itoh
mov rdi, number
call printstring
.exit: mov rax, sys_exit
mov rdi, exit_ok
syscall
;
; itoh - rdi integer to convert
; - rsi address of string to return result
; - rdx if true add a newline to string
; return nothing
itoh: push rcx
push rax
xor r10, r10 ; r10 counts the digits pushed onto stack
mov r9, rdx ; save newline flag in r9
mov rax, rdi ; rax is bottom half of dividend
mov rcx, 16 ; rcx is divisor
.div: xor rdx, rdx ; zero rdx, top half of 128 bit dividend
div rcx ; divide rdx:rax by rcx
push rdx ; rdx is remainder
inc r10 ; increment digit counter
cmp rax, 0 ; is quotient zero?
jne .div ; no - keep dividing by 16 and pushing remainder
.pop: mov byte[rsi], "0"
inc rsi
mov byte[rsi], "x"
inc rsi
.p0: pop r11 ; get a digit from stack
cmp r11, 10
jl .p1
sub r11, 10
add r11, "a"
jmp .p2
.p1: add r11, "0" ; convert to ascii char
.p2: mov byte[rsi],r11b ; copy ascii digit to string buffer
dec r10 ; decrement digit count
inc rsi ; point rsi to next char position
cmp r10, 0 ; is digit counter 0
jne .p0 ; no, go get another digit from stack
cmp r9, 0
je .exit
mov byte[rsi], LF
inc rsi
.exit: mov byte[rsi], NULL ; terminate string
pop rax
pop rcx
ret
;
; printstring - rdi is address of string
; return nothing
printstring:
push rcx ; sys_write modifies rcx
push rax ; sys_write modifies rax
xor rdx, rdx ; zero rdx, char count
mov rsi, rdi ; use rsi to index into string
.countloop:
cmp byte [rsi],NULL ; end of string?
je .countdone ; yes, finished counting
inc rdx ; no, count++
inc rsi ; point to next char
jmp .countloop
.countdone:
cmp rdx, 0 ; were there any characters?
je .printdone ; no - exit
mov rax, sys_write ; write system call
mov rsi, rdi ; address of string
mov rdi, stdout ; write to stdout
syscall ; number of bytes to write is in rdx
.printdone:
pop rax
pop rcx
ret
yasm -felf64 -gdwarf2 test.asm
ld -g -otest test.o
gdb test
Type "apropos word" to search for commands related to "word"...
Reading symbols from test...
(gdb) run
Starting program: /home/david/asm/test
Test should print 0x403000 from constant: 0x403000
Test should print 0x403000 from sys_brk return: 0x403000
[Inferior 1 (process 28325) exited normally]
(gdb) q
./test
Test should print 0x403000 from constant: 0x403000
Test should print 0x403000 from sys_brk return: 0x14cf000

gdb: how to disassemble non-struction piece of code?

Having this in nasm:
section .data
cod: db '0123456789ABCDEF'
section .text
global _start
_start:
nop
mov rax, 0x1122334455667788
mov rdi, 1
mov rdx, 1
mov rcx, 64
.loop:
push rax
sub rcx, 4
sar rax, cl
and rax, 0xf
lea rsi, [cod + rax]
mov rax, 1
push rcx
syscall
pop rcx
pop rax
test rcx, rcx
jnz .loop
mov rax, 60
xor rdi, rdi
syscall
Then in gdb:
disas _start.loop
gives:
Attempt to extract a component of a value that is not a structure.
How can I disas the loop in gdb?
PS: I would also like to know what gdb means by "structure". I suppose it has nothing to do with C structs, but rather with function frames? So that gdb can see where a function starts and ends? In my case it is a loop, not a function, so it does not have a frame. Is that what the error means?
EDIT:
I have tried stepping in gdb:
(gdb) break *_start+1
Breakpoint 1, 0x0000000000401001 in _start ()
(gdb) n
Single stepping until exit from function _start,
which has no line number information.
And then output
1122334455667788[Inferior 1 (process 6257) exited normally]
BUT I have not seen any instruction from the <_start.loop> loop; it just exits from _start.
I do not know whether this is because .loop is a NASM local label or because it does not have "struct behavior", but how can I see the .loop code in gdb before exiting from _start?

Mystery: casting a GNU C label pointer to a function pointer, with inline asm to put a ret in that block. Block being optimized away?

Firstly: This code is meant purely for fun; please do not do anything like this in production. We will not be responsible for any harm caused to you, your company or your reindeer after compiling and executing this piece of code in any environment. The code below is not safe, not portable and is plainly dangerous. Be warned. Long post below. You were warned.
Now, after the disclaimer: Let's consider the following piece of code:
#include <stdio.h>
int fun()
{
return 5;
}
typedef int(*F)(void) ;
int main(int argc, char const *argv[])
{
void *ptr = &&hi;
F f = (F)ptr;
int c = f();
printf("TT: %d\n", c);
if(c == 5) goto bye;
//else goto bye; /* <---- This is the most important line. Pay attention to it */
hi:
c = 5;
asm volatile ("movl $5, %eax");
asm volatile ("retq");
bye:
return 66;
}
To begin with, we have the function fun, which I created purely as a reference for the generated assembly code.
Then we declare a function pointer F to functions taking no parameters and returning an int.
Then we use the not so well known GCC extension https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html to get the address of a label hi, and this works in clang too. Then we do something evil, we create a function pointer F called f and initialize it to be the label above.
Then, worst of all, we actually call this function and assign its return value to a local variable called c, and then we print it out.
The following is an if to check whether the value assigned to c is actually the one we need, and if so, go to bye so that the application exits normally, with exit code 66. If that can be considered a normal exit code.
The next line is commented out, but I can say this is the most important line in the entire application.
The piece of code after the label hi is to assign 5 to the value of c, then two lines of assembly to initialize the value of eax to 5 and to actually return from the "function" call. As mentioned, there is a reference function, fun which generates the same code.
And now we compile this application, and run it on our online platform: https://gcc.godbolt.org/z/K6z5Yc
It generates the following assembly (with -O1 turned on; -O0 gives a similar result, albeit a bit longer):
# else goto bye is COMMENTED OUT
fun:
mov eax, 5
ret
.LC0:
.string "TT: %d\n"
main:
push rbx
mov eax, OFFSET FLAT:.L3
call rax
mov ebx, eax
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
cmp ebx, 5
je .L4
.L3:
movl $5, %eax
retq
.L4:
mov eax, 66
pop rbx
ret
The important lines are mov eax, OFFSET FLAT:.L3 where the L3 corresponds to our hi label, and the line after that: call rax which actually calls it.
And runs like:
ASM generation compiler returned: 0
Execution build compiler returned: 0
Program returned: 66
TT: 5
Now, let's revisit the most important line in the application and uncomment it.
With -O0 we get the following assembly, generated by gcc:
# else goto bye is UNCOMMENTED
# even gcc -O0 "knows" hi: is unreachable.
fun:
push rbp
mov rbp, rsp
mov eax, 5
pop rbp
ret
.LC0:
.string "TT: %d\n"
main:
push rbp
mov rbp, rsp
sub rsp, 48
mov DWORD PTR [rbp-36], edi
mov QWORD PTR [rbp-48], rsi
mov QWORD PTR [rbp-8], OFFSET FLAT:.L4
mov rax, QWORD PTR [rbp-8]
mov QWORD PTR [rbp-16], rax
mov rax, QWORD PTR [rbp-16]
call rax
mov DWORD PTR [rbp-20], eax
mov eax, DWORD PTR [rbp-20]
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
cmp DWORD PTR [rbp-20], 5
nop
.L4:
mov eax, 66
leave
ret
and the following output:
ASM generation compiler returned: 0
Execution build compiler returned: 0
Program returned: 66
so, as you can see our printf was never called, the culprit is the line mov QWORD PTR [rbp-8], OFFSET FLAT:.L4 where L4 actually corresponds to our bye label.
And from what I can see from the generated assembly, not a piece of code from the part after hi was added into the generated code.
But at least the application runs and at least has some code for comparing c to 5.
On the other hand, clang with -O0 generates the following nightmare, which, by the way, crashes:
# else goto bye is UNCOMMENTED
# clang -O0 also doesn't emit any instructions for the hi: block
fun: # #fun
push rbp
mov rbp, rsp
mov eax, 5
pop rbp
ret
main: # #main
push rbp
mov rbp, rsp
sub rsp, 48
mov dword ptr [rbp - 4], 0
mov dword ptr [rbp - 8], edi
mov qword ptr [rbp - 16], rsi
mov qword ptr [rbp - 24], 1
mov rax, qword ptr [rbp - 24]
mov qword ptr [rbp - 32], rax
call qword ptr [rbp - 32]
mov dword ptr [rbp - 36], eax
mov esi, dword ptr [rbp - 36]
movabs rdi, offset .L.str
mov al, 0
call printf
cmp dword ptr [rbp - 36], 5
jne .LBB1_2
jmp .LBB1_3
.LBB1_2:
jmp .LBB1_3
.LBB1_3:
mov eax, 66
add rsp, 48
pop rbp
ret
.L.str:
.asciz "TT: %d\n"
If we turn on some optimization, for example O1, we get from gcc:
# else goto bye is UNCOMMENTED
# gcc -O1
fun:
mov eax, 5
ret
.LC0:
.string "TT: %d\n"
main:
sub rsp, 8
mov eax, OFFSET FLAT:.L3
call rax
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
.L3:
mov eax, 66
add rsp, 8
ret
and the application crashes, which is sort of understandable. Again, the compiler has entirely removed our hi section (mov eax, OFFSET FLAT:.L3 tiptoes to L3, which now corresponds to our bye section) and unfortunately decided that it's a good idea to increase rsp before a ret, so we are sure to end up somewhere totally different from where we need to be.
And clang delivers something even more dubious:
# else goto bye is UNCOMMENTED
# clang -O1
fun: # #fun
mov eax, 5
ret
main: # #main
push rax
mov eax, 1
call rax
mov edi, offset .L.str
mov esi, eax
xor eax, eax
call printf
mov eax, 66
pop rcx
ret
.L.str:
.asciz "TT: %d\n"
1 ? How on earth did clang end up with this?
To some level I understand that the compiler decided that dead code after an if where both if and else go to the same location is not needed, but here my knowledge and insight stops.
So now, dear C and C++ gurus, assembly aficionados and compiler crushers, here comes the question:
Why?
Why do you think the compiler decided that the two labels should be considered equivalent once we added the else branch, or why did clang put 1 there, and last but not least: someone with a deep understanding of the C standard could maybe point out where this piece of code deviated so badly from normality that we ended up in this really really weird situation.
someone with a deep understanding of the C standard could maybe point out where this piece of code deviated so badly from normality that we ended up in this really really weird situation.
You think the ISO C standard has anything to say about this code? It's chock full of UB and GNU extensions, notably pointers to local labels.
Casting a label pointer to a function pointer and calling through it is obviously UB. The GCC manual doesn't say you can do that. It's also UB to goto a label in another function.
You were only able to make that work by tricking the compiler into thinking that block might be reached so it's not removed, then using GNU C Basic asm statements to emit a ret instruction there.
GCC and clang remove dead code even with optimization disabled; e.g. if(0) { ... } doesn't emit any instructions to implement the ...
Also note that the c=5 in hi: compiles with optimization fully disabled (and else goto bye commented) to asm like movl $5, -20(%rbp). i.e. using the caller's RBP to modify local variables in the stack frame of the caller. So you have a nested function.
GNU C allows you to define nested functions that can access the local vars of their parent scope. (If you liked the asm you got from your experiment, you'll love the executable trampoline of machine-code that GCC stores to the stack with mov-immediate if you take a pointer to a nested function!)
asm volatile ("movl $5, %eax"); is missing a clobber on EAX. You step on the compiler's toes which would be UB if this statement was ever reached normally, rather than as if it were a separate function.
The use-case for GNU C Basic asm (no constraints / clobbers) is instructions like cli (disable interrupts), not anything involving integer registers, and definitely not ret.
If you want to define a callable function using inline asm, you can use asm("") at global scope, or as the body of an __attribute__((naked)) function.

Segmentation fault in NASM 64bit

I am trying to output the result to the user after getting 3 inputs from scanf.
When I run my code, I am able to get the input I need. However it crashes after I collect the input and begin the calculation.
By the way, I am using Ubuntu 14.04 with g++ and NASM 64bit.
Here's how it should look:
This program is brought to you by Chris Tarazi
Welcome to Areas of Trapezoids
Please enter one of the base numbers: 5.8
Please enter the other base number: 2.2
Please enter the height: 6.5
****//Crashes here with Segmentation fault (core dumped)****
The area of a trapezoid with sizes 5.799999999999999365, 2.200000000000000153,
and 6.500000000000000000 is 26.000000000000000328
Have a nice day. Enjoy your trapezoids.
C++ file:
#include <stdio.h>
#include <stdint.h>
extern "C" double ComputeArea(); // links with global in assembly
using namespace std;
int main()
{
double area;
printf("This program is brought to you by Chris Tarazi.\n");
area = ComputeArea();
printf("Have a nice day. Enjoy your trapezoids.\n");
return 0;
}
Assembly file:
extern printf ; This function will be linked later.
extern scanf
global ComputeArea ; Declare function global to link with "extern" from C++.
;---------------------------------Declare variables-------------------------------------------
segment .data
welcome: db "Welcome to the area of trapezoids.", 10, 0
input: db "Please enter one of the base numbers: ", 0
secInput: db "Please enter the other base number: ", 0
output: db "The area of a trapezoid with sizes %1.18lf, %1.18lf, and %1.18lf is %1.18lf .", 10, 0
hInput: db "Please enter the height: ", 0
inputformat: db "%lf", 0
stringformat: db "%s", 0
fourfloatformat: db "%1.18lf %1.18lf %1.18lf %1.18lf", 0
;---------------------------------Begin segment of executable code------------------------------
segment .text
ComputeArea: ; Area of trapezoid = ((a + b) / 2) * h.
push rbp ; Save a copy of the stack base pointer
mov rbp, rsp ; We do this in order to be 100% compatible with C and C++.
push rbx ; Back up rbx
push rcx ; Back up rcx
push rdx ; Back up rdx
push rsi ; Back up rsi
push rdi ; Back up rdi
push r8 ; Back up r8
push r9 ; Back up r9
push r10 ; Back up r10
push r11 ; Back up r11
push r12 ; Back up r12
push r13 ; Back up r13
push r14 ; Back up r14
push r15 ; Back up r15
pushf ; Back up rflags
;---------------------------------Output messages to user---------------------------------------
mov qword rax, 0
mov rdi, stringformat
mov rsi, welcome
call printf
mov qword rax, 0
mov rdi, stringformat
mov rsi, input
call printf
push qword 0
mov qword rax, 0
mov rdi, inputformat
mov rsi, rsp ;firstbase
call scanf
movsd xmm0, [rsp]
pop rax
mov qword rax, 0
mov rdi, stringformat
mov rsi, secInput
call printf
push qword 0
mov qword rax, 0
mov rdi, inputformat
mov rsi, rsp ;secondbase
call scanf
movsd xmm1, [rsp + 4]
pop rax
mov qword rax, 0
mov rdi, stringformat
mov rsi, hInput
call printf
push qword 0
mov qword rax, 0
mov rdi, inputformat
mov rsi, rsp ;height
call scanf
movsd xmm2, [rsp + 8]
pop rax
;---------------------------------Begin ComputeArea Calculation-----------------------------------
mov rax, 2
cvtsi2sd xmm3, rax
addsd xmm0, xmm1
divsd xmm0, xmm3
mulsd xmm0, xmm2
ret
;---------------------------------Output result to user-------------------------------------------
mov rax, 3
mov rdi, output
call printf
First off, why on earth are you saving ALL of those registers?!? The ABI for 64-bit Linux says you only need to save rbx, rbp, and r12 - r15, and only if you use those registers in your function. Also, since you are writing assembly, there is no need to create a stack frame in 64-bit land (plus you aren't even using rbp, so why create a stack frame?). The only thing that is very important is to make sure your stack is aligned on a 16-byte boundary when you call another function. call pushes an 8-byte return address, so all you need in your ComputeArea function is sub rsp, 8 at the start and add rsp, 8 right before your ret.
In your first scanf you are using rsp without adjusting it, you just overwrote something!
You do some computations here:
mov rax, 2
cvtsi2sd xmm3, rax
addsd xmm0, xmm1
divsd xmm0, xmm3
mulsd xmm0, xmm2
ret
You return from the procedure here but do not pop all of those registers you just pushed!! So basically your stack pointer is all messed up! The CPU does not know what the return address is!
What you do in the prologue, must be reversed in the epilogue before you return!
Maybe, you should start simple, read in 3 floats and try to print them!
When I correct your code, this is my output:
Welcome to the area of trapezoids.
Please enter one of the base numbers: 5.8
Please enter the other base number: 2.2
Please enter the height: 6.5
The area of a trapezoid with sizes 5.799999999999999822, 2.200000000000000178, and 6.500000000000000000 is 26.000000000000000000 .

What style assembly is this (intel, att...etc?) and how can I produce it?

I'm trying to produce assembly code like this (so that it works with nasm)
;hello.asm
[SECTION .text]
global _start
_start:
jmp short ender
starter:
xor eax, eax ;clean up the registers
xor ebx, ebx
xor edx, edx
xor ecx, ecx
mov al, 4 ;syscall write
mov bl, 1 ;stdout is 1
pop ecx ;get the address of the string from the stack
mov dl, 5 ;length of the string
int 0x80
xor eax, eax
mov al, 1 ;exit the shellcode
xor ebx,ebx
int 0x80
ender:
call starter ;put the address of the string on the stack
db 'hello'
First off, what assembly style is this, and second, how can I produce it from a C file using a command similar to gcc -S code.c -o code.S -masm=intel?
This is Intel style.
What's wrong with the commandline you wrote in the question?