I am printing some information about the CPU in my OS using the CPUID instruction.
Reading and printing the vendor string (GenuineIntel) works well, but reading the brand string gives me a strange, partly garbled string.
ok cpu-info <= Run command
CPU Vendor name: GenuineIntel <= Vendor string is good
CPU Brand: D: l(R) Core(TMD: CPU MD: <= What..?
ok
The brand string is supposed to be:
Intel(R) Core(TM) i5 CPU M 540
But what I got is:
D: l(R) Core(TMD: CPU MD:
C++ code:
char vendorString[13] = { 0, };
Dword eax, ebx, ecx, edx;
ACpuid(0, &eax, &ebx, &ecx, &edx);
*((Dword*)vendorString) = ebx;
*((Dword*)vendorString + 1) = edx;
*((Dword*)vendorString + 2) = ecx;
Console::Output.Write(L"CPU vendor name: ");
for (int i = 0; i < 13; i++) {
Console::Output.Write((wchar_t)(vendorString[i]));
}
Console::Output.WriteLine();
char brandString[48] = { 0, };
ACpuid(0x80000002, &eax, &ebx, &ecx, &edx);
*((Dword*)brandString) = eax;
*((Dword*)brandString + 1) = ebx;
*((Dword*)brandString + 2) = ecx;
*((Dword*)brandString + 3) = edx;
ACpuid(0x80000003, &eax, &ebx, &ecx, &edx);
*((Dword*)brandString + 4) = eax;
*((Dword*)brandString + 5) = ebx;
*((Dword*)brandString + 6) = ecx;
*((Dword*)brandString + 7) = edx;
ACpuid(0x80000004, &eax, &ebx, &ecx, &edx);
*((Dword*)brandString + 8) = eax;
*((Dword*)brandString + 9) = ebx;
*((Dword*)brandString + 10) = ecx;
*((Dword*)brandString + 11) = edx;
Console::Output.Write(L"CPU brand: ");
for (int i = 0; i < 48; i++) {
Console::Output.Write((wchar_t) brandString[i]);
}
Console::Output.WriteLine();
NOTE:
This program is a UEFI application, so there is no problem with permissions.
Console is a wrapper class for the EFI console, not the C# class.
Dword = unsigned 32-bit integer
Assembly code(MASM):
;Cpuid command
;ACpuid(Type, pEax, pEbx, pEcx, pEdx)
ACpuid Proc
;Type => Rcx
;pEax => Rdx
;pEbx => R8
;pEcx => R9
;pEdx => [ rbp + 48 ] ?
push rbp
mov rbp, rsp
push rax
push rsi
mov rax, rcx
cpuid
mov [ rdx ], eax
mov [ r8 ], ebx
mov [ r9 ], ecx
mov rsi, [ rbp + 48 ]
mov [ rsi ], rdx
pop rsi
pop rax
pop rbp
ret
ACpuid Endp
I agree with Ross Ridge that you should use the compiler intrinsic __cpuid. As for why your code likely doesn't work as is: there are some bugs that will cause problems.
CPUID destroys the contents of RAX, RBX, RCX, and RDX and yet you do this in your code:
cpuid
mov [ rdx ], eax
RDX has been destroyed by the time mov [ rdx ], eax is executed, rendering the pointer in RDX invalid. You'll need to move RDX to another register before using the CPUID instruction.
Per the Windows 64-bit Calling Convention these are the volatile registers that need to be preserved by the caller:
The registers RAX, RCX, RDX, R8, R9, R10, R11 are considered volatile and must be considered destroyed on function calls (unless otherwise safety-provable by analysis such as whole program optimization).
These are the non-volatile ones that need to be preserved by the callee:
The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, and R15 are considered nonvolatile and must be saved and restored by a function that uses them.
We can use R10 (a volatile register) to store RDX temporarily. Rather than use RSI in the code we can reuse R10 for updating the value at pEdx. We won't need to preserve RSI if we don't use it. CPUID does destroy RBX, and RBX is non-volatile, so we need to preserve it. RAX is volatile so we don't need to preserve it.
In your code you have this line:
mov [ rsi ], rdx
RSI is a memory address (pEdx) provided by the caller to store the value in EDX. The code you have would move the contents of the 8-byte register RDX to a memory location that was expecting a 4-byte DWORD. This could potentially trash data in the caller. This really should have been:
mov [ rsi ], edx
With all of the above in mind we could code the ACpuid routine this way:
option casemap:none
.code
;Cpuid command
;ACpuid(Type, pEax, pEbx, pEcx, pEdx)
ACpuid Proc
;Type => Rcx
;pEax => Rdx
;pEbx => R8
;pEcx => R9
;pEdx => [ rbp + 48 ] (fifth argument: above saved RBP, return address and 32-byte shadow space)
push rbp
mov rbp, rsp
push rbx ; Preserve RBX (destroyed by CPUID)
mov r10, rdx ; Save RDX before CPUID
mov rax, rcx
cpuid
mov [ r10 ], eax
mov [ r8 ], ebx
mov [ r9 ], ecx
mov r10, [ rbp + 48 ]
mov [ r10 ], edx ; Last parameter is pointer to 32-bit DWORD,
; Move EDX to the memory location, not RDX
pop rbx
pop rbp
ret
ACpuid Endp
end
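Once the routine returns the right register values (or once __cpuid is used instead), the remaining work is just laying the registers out in memory in the right order. Here is a minimal C++ sketch of that step. The helper names put_reg, vendor_from_regs and brand_chunk are hypothetical, and the code assumes a little-endian host, which holds on x86. The register values would come from ACpuid or the intrinsic; this sketch does not execute CPUID itself.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: copy one 32-bit register value into the buffer as
// four raw bytes. memcpy avoids the pointer-cast stores used in the question.
static void put_reg(char* dst, std::uint32_t value) {
    std::memcpy(dst, &value, sizeof value);
}

// Leaf 0 returns the vendor string in EBX, EDX, ECX (in that order).
std::string vendor_from_regs(std::uint32_t ebx, std::uint32_t edx,
                             std::uint32_t ecx) {
    char buf[12];
    put_reg(buf + 0, ebx);
    put_reg(buf + 4, edx);
    put_reg(buf + 8, ecx);
    return std::string(buf, sizeof buf);
}

// Leaves 0x80000002..0x80000004 each return 16 bytes of the brand string
// in EAX, EBX, ECX, EDX (in that order). Concatenating the three chunks
// gives the full 48-byte brand string.
std::string brand_chunk(std::uint32_t eax, std::uint32_t ebx,
                        std::uint32_t ecx, std::uint32_t edx) {
    char buf[16];
    put_reg(buf + 0, eax);
    put_reg(buf + 4, ebx);
    put_reg(buf + 8, ecx);
    put_reg(buf + 12, edx);
    return std::string(buf, sizeof buf);
}
```

Calling brand_chunk once per leaf and appending the results reproduces what the question's three groups of stores are meant to do.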
I was thinking again about implementing the quadratic sieve for fun, which requires Gaussian elimination over a binary field; that is, the operations required are 1. swapping rows and 2. XORing rows.
My ideas were either to maintain a bit array using a vector of 64-bit ints and bit twiddling, or use vector<bool>, which is probably space-optimized on my system. The bit array must be able to be dynamically sized, so std::bitset won't work. The advantage of maintaining my own ints is that I can XOR 64 bits at a time which is a neat trick. I wanted to see what a compiler would do for a loop that XOR'd bool vectors: (I wasn't able to use ^=, see operator |= on std::vector<bool>)
void xor_vector(std::vector<bool>& a, std::vector<bool>& b) {
for (std::size_t i=0; i<a.size(); ++i)
a[i] = a[i] ^ b[i];
}
I have a very basic understanding of x86 but it looks like the compiler isn't actually XORing words together? Is there a way to get the compiler to XOR entire words at a time?
https://godbolt.org/z/PbGdv3sKT
xor_vector(std::vector<bool, std::allocator<bool> >&, std::vector<bool, std::allocator<bool> >&):
mov r8, QWORD PTR [rdi]
mov rax, QWORD PTR [rdi+16]
mov edx, DWORD PTR [rdi+24]
sub rax, r8
lea rdi, [rdx+rax*8]
test rdi, rdi
je .L11
push rbp
mov r10d, 1
push rbx
mov r9, QWORD PTR [rsi]
xor esi, esi
jmp .L7
.L16:
mov rdx, r10
sal rdx, cl
mov rcx, QWORD PTR [r11]
mov rbp, rdx
test rdx, rcx
setne bl
and rbp, QWORD PTR [rax]
setne bpl
.L4:
mov rax, rdx
not rdx
or rax, rcx
and rdx, rcx
cmp bpl, bl
cmovne rdx, rax
add rsi, 1
mov QWORD PTR [r11], rdx
cmp rsi, rdi
je .L15
.L7:
test rsi, rsi
lea rax, [rsi+63]
mov rdx, rsi
cmovns rax, rsi
sar rdx, 63
shr rdx, 58
sar rax, 6
lea rcx, [rsi+rdx]
sal rax, 3
and ecx, 63
lea r11, [r8+rax]
add rax, r9
sub rcx, rdx
jns .L16
add rcx, 64
mov rdx, r10
sal rdx, cl
mov rcx, QWORD PTR [r11-8]
mov rbp, rdx
test rcx, rdx
setne bl
and rbp, QWORD PTR [rax-8]
setne bpl
sub r11, 8
jmp .L4
.L15:
pop rbx
pop rbp
ret
.L11:
ret
My question is similar to bitwise operations on vector<bool> but the answers are dated and don't seem to answer my question.
Update: I also tested with a 256-bit std::bitset. I still don't see whole machine words being XORed.
void xor_vector(std::bitset<256>& a, std::bitset<256>& b) {
for (std::size_t i=0; i<a.size(); ++i)
a[i] = a[i] ^ b[i];
}
https://godbolt.org/z/jKEf89E1j
xor_vector(std::bitset<256ul>&, std::bitset<256ul>&):
push rbx
mov r8, rdi
mov r11, rsi
xor edx, edx
mov ebx, 1
.L4:
mov rsi, rdx
mov rcx, rdx
mov rax, rbx
shr rsi, 6
and ecx, 63
sal rax, cl
mov rdi, QWORD PTR [r8+rsi*8]
mov rcx, rax
and rcx, QWORD PTR [r11+rsi*8]
mov rcx, rax
setne r10b
test rax, rdi
not rax
setne r9b
or rcx, rdi
and rax, rdi
cmp r10b, r9b
cmovne rax, rcx
add rdx, 1
mov QWORD PTR [r8+rsi*8], rax
cmp rdx, 256
jne .L4
pop rbx
ret
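For comparison, the other idea mentioned in the question (keeping the bits in a vector of 64-bit ints) can be sketched as below. This is only an illustration under assumptions: the Row alias and the names xor_rows, xored and get_bit are hypothetical, and both rows are assumed to have the same number of words. Because the loop works on whole uint64_t elements, each iteration XORs 64 bits at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One matrix row: bit k lives in word k / 64 at position k % 64.
using Row = std::vector<std::uint64_t>;

// a ^= b, one 64-bit word per iteration. Rows must have equal length.
void xor_rows(Row& a, const Row& b) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] ^= b[i];
    }
}

// Convenience wrapper that returns the XOR instead of modifying in place.
Row xored(Row a, const Row& b) {
    xor_rows(a, b);
    return a;
}

// Read a single bit, for the pivot search in the elimination.
bool get_bit(const Row& r, std::size_t bit) {
    return ((r[bit / 64] >> (bit % 64)) & 1u) != 0;
}
```

Swapping two rows is then just std::swap on the two vectors, which exchanges their buffers without copying any words.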
Because of this bug in the VC++ compiler, I’m porting parts of my code from C++ to assembly. VC++ doesn’t support inline assembly for 64-bit targets, so I’m forced to create a separate assembly file.
Why does the following empty function crash at runtime with “Access violation executing location 0x0000000000000000”?
C++ portion:
extern "C"
{
void asm_proc( const void* p1, const void* p2, void* p3 );
}
int main()
{
asm_proc( nullptr, nullptr, nullptr );
return 0;
}
Assembly portion:
PUBLIC asm_proc
; Save the non-volatile XMM registers to the stack starting from specified offset. This uses 160 = A0h bytes of the stack.
; https://learn.microsoft.com/en-us/cpp/build/register-usage?view=vs-2017#register-volatility-and-preservation
SAVE_XMM_REGS macro offset
movaps [ rsp + offset ], xmm6
movaps [ rsp + offset + 10h ], xmm7
movaps [ rsp + offset + 20h ], xmm8
movaps [ rsp + offset + 30h ], xmm9
movaps [ rsp + offset + 40h ], xmm10
movaps [ rsp + offset + 50h ], xmm11
movaps [ rsp + offset + 60h ], xmm12
movaps [ rsp + offset + 70h ], xmm13
movaps [ rsp + offset + 80h ], xmm14
movaps [ rsp + offset + 90h ], xmm15
endm
; Restore XMM registers from the stack.
LOAD_XMM_REGS macro offset
movaps xmm15, [ rsp + offset + 90h ]
movaps xmm14, [ rsp + offset + 80h ]
movaps xmm13, [ rsp + offset + 70h ]
movaps xmm12, [ rsp + offset + 60h ]
movaps xmm11, [ rsp + offset + 50h ]
movaps xmm10, [ rsp + offset + 40h ]
movaps xmm9, [ rsp + offset + 30h ]
movaps xmm8, [ rsp + offset + 20h ]
movaps xmm7, [ rsp + offset + 10h ]
movaps xmm6, [ rsp + offset ]
endm
.CODE
align(16)
asm_proc PROC FRAME
; const void* p1: rcx
; const void* p2: rdx
; void *p3: r8
; Some boilerplate copy-pasted from this repository: https://github.com/lallousx86/AsmInVs/tree/master/x64asm
; Prologue
sub rsp, 030h ; allocate stack space
.allocstack 030h ; encode that change
push rbp ; save old frame pointer
.pushreg rbp ; encode stack operation
mov rbp, rsp ; set new frame pointer
.setframe rbp, 0 ; encode frame pointer
.endprolog
; Need stack space for XMM6-15 registers, they're non-volatile
sub rsp, 0C0h
; Save non-volatile registers we use.
; https://learn.microsoft.com/en-us/cpp/build/register-usage?view=vs-2017#register-volatility-and-preservation
mov QWORD ptr [ rsp ], r12
mov QWORD ptr [ rsp + 8 ], r13
mov QWORD ptr [ rsp + 10h ], r14
mov QWORD ptr [ rsp + 18h ], r15
SAVE_XMM_REGS 20h
; Actual code goes here, stripped out.
; Restore non-volatile registers
LOAD_XMM_REGS 20h
mov r15, QWORD ptr [ rsp + 18h ]
mov r14, QWORD ptr [ rsp + 10h ]
mov r13, QWORD ptr [ rsp + 8 ]
mov r12, QWORD ptr [ rsp ]
add rsp, ( 0C0h + 030h )
pop rbp
ret
asm_proc ENDP
END
P.S. My function doesn’t call any other functions; it’s a leaf. The actual code I’ve stripped out does use r12-r15 and xmm6-xmm15, so I am required to preserve them; see the ABI.
Consider the following code, in C++:
#include <cstdlib>
std::size_t count(std::size_t n)
{
std::size_t i = 0;
while (i < n) {
asm volatile("": : :"memory");
++i;
}
return i;
}
int main(int argc, char* argv[])
{
return count(argc > 1 ? std::atoll(argv[1]) : 1);
}
It is just a loop that increments a counter and returns it at the end. The asm volatile prevents the loop from being optimized away. We compile it with g++ 8.1 and clang++ 5.0 using the arguments -Wall -Wextra -std=c++11 -g -O3.
Now, if we look at what compiler explorer is producing, we have, for g++:
count(unsigned long):
mov rax, rdi
test rdi, rdi
je .L2
xor edx, edx
.L3:
add rdx, 1
cmp rax, rdx
jne .L3
.L2:
ret
main:
mov eax, 1
xor edx, edx
cmp edi, 1
jg .L25
.L21:
add rdx, 1
cmp rdx, rax
jb .L21
mov eax, edx
ret
.L25:
push rcx
mov rdi, QWORD PTR [rsi+8]
mov edx, 10
xor esi, esi
call strtoll
mov rdx, rax
test rax, rax
je .L11
xor edx, edx
.L12:
add rdx, 1
cmp rdx, rax
jb .L12
.L11:
mov eax, edx
pop rdx
ret
and for clang++:
count(unsigned long): # #count(unsigned long)
test rdi, rdi
je .LBB0_1
mov rax, rdi
.LBB0_3: # =>This Inner Loop Header: Depth=1
dec rax
jne .LBB0_3
mov rax, rdi
ret
.LBB0_1:
xor edi, edi
mov rax, rdi
ret
main: # #main
push rbx
cmp edi, 2
jl .LBB1_1
mov rdi, qword ptr [rsi + 8]
xor ebx, ebx
xor esi, esi
mov edx, 10
call strtoll
test rax, rax
jne .LBB1_3
mov eax, ebx
pop rbx
ret
.LBB1_1:
mov eax, 1
.LBB1_3:
mov rcx, rax
.LBB1_4: # =>This Inner Loop Header: Depth=1
dec rcx
jne .LBB1_4
mov rbx, rax
mov eax, ebx
pop rbx
ret
Understanding the code generated by g++ is not that complicated; the loop is:
.L3:
add rdx, 1
cmp rax, rdx
jne .L3
Every iteration increments rdx and compares it to rax, which holds the loop bound n.
Now, I have no idea what clang++ is doing. It uses dec, which seems odd to me, and I can't even see where the actual loop is. My question is: what is clang doing?
(I am looking for comments about the clang assembly code to describe what is done at each step and how it actually works).
The effect of the function is to return n, either by counting up to n and returning the result, or by simply returning the passed-in value of n. The clang code does the latter. The counting loop is here:
mov rax, rdi
.LBB0_3: # =>This Inner Loop Header: Depth=1
dec rax
jne .LBB0_3
mov rax, rdi
ret
It begins by copying the value of n into rax. It decrements the value in rax, and if the result is not 0, it jumps back to .LBB0_3. If the value is 0 it falls through to the next instruction, which copies the original value of n into rax and returns.
There is no i stored, but the code does the loop the prescribed number of times, and returns the value that i would have had, namely, n.
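The transformation described above can be written back out as C++ to make it easier to see. This is a hand-written sketch of what the clang output amounts to, not something the compiler emits; the names count_up and count_down are hypothetical, and the empty asm volatile statement is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstddef>

// The original loop: count i upward until it reaches n.
std::size_t count_up(std::size_t n) {
    std::size_t i = 0;
    while (i < n) {
        asm volatile("" : : : "memory");  // keeps the loop from being removed
        ++i;
    }
    return i;
}

// The shape of the clang code: copy n into a scratch counter, decrement it
// to zero (dec / jne), then return the untouched original n.
std::size_t count_down(std::size_t n) {
    if (n == 0) {
        return 0;                         // test rdi, rdi / je .LBB0_1
    }
    std::size_t remaining = n;            // mov rax, rdi
    do {
        asm volatile("" : : : "memory");
        --remaining;                      // dec rax
    } while (remaining != 0);             // jne .LBB0_3
    return n;                             // mov rax, rdi / ret
}
```

Both functions run the loop body n times and return n; counting down lets the decrement set the zero flag itself, so no separate cmp is needed.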
I am trying to output the result to the user after getting 3 inputs from scanf.
When I run my code, I am able to get the input I need. However it crashes after I collect the input and begin the calculation.
By the way, I am using Ubuntu 14.04 with g++ and NASM 64bit.
Here's how it should look:
This program is brought to you by Chris Tarazi
Welcome to Areas of Trapezoids
Please enter one of the base numbers: 5.8
Please enter the other base number: 2.2
Please enter the height: 6.5
****//Crashes here with Segmentation fault (core dumped)****
The area of a trapezoid with sizes 5.799999999999999365, 2.200000000000000153,
and 6.500000000000000000 is 26.000000000000000328
Have a nice day. Enjoy your trapezoids.
C++ file:
#include <stdio.h>
#include <stdint.h>
extern "C" double ComputeArea(); // links with global in assembly
using namespace std;
int main()
{
double area;
printf("This program is brought to you by Chris Tarazi.\n");
area = ComputeArea();
printf("Have a nice day. Enjoy your trapezoids.\n");
return 0;
}
Assembly file:
extern printf ; This function will be linked later.
extern scanf
global ComputeArea ; Declare function global to link with "extern" from C++.
;---------------------------------Declare variables-------------------------------------------
segment .data
welcome: db "Welcome to the area of trapezoids.", 10, 0
input: db "Please enter one of the base numbers: ", 0
secInput: db "Please enter the other base number: ", 0
output: db "The area of a trapezoid with sizes %1.18lf, %1.18lf, and %1.18lf is %1.18lf .", 10, 0
hInput: db "Please enter the height: ", 0
inputformat: db "%lf", 0
stringformat: db "%s", 0
fourfloatformat: db "%1.18lf %1.18lf %1.18lf %1.18lf", 0
;---------------------------------Begin segment of executable code------------------------------
segment .text
ComputeArea: ; Area of trapezoid = ((a + b) / 2) * h.
push rbp ; Save a copy of the stack base pointer
mov rbp, rsp ; We do this in order to be 100% compatible with C and C++.
push rbx ; Back up rbx
push rcx ; Back up rcx
push rdx ; Back up rdx
push rsi ; Back up rsi
push rdi ; Back up rdi
push r8 ; Back up r8
push r9 ; Back up r9
push r10 ; Back up r10
push r11 ; Back up r11
push r12 ; Back up r12
push r13 ; Back up r13
push r14 ; Back up r14
push r15 ; Back up r15
pushf ; Back up rflags
;---------------------------------Output messages to user---------------------------------------
mov qword rax, 0
mov rdi, stringformat
mov rsi, welcome
call printf
mov qword rax, 0
mov rdi, stringformat
mov rsi, input
call printf
push qword 0
mov qword rax, 0
mov rdi, inputformat
mov rsi, rsp ;firstbase
call scanf
movsd xmm0, [rsp]
pop rax
mov qword rax, 0
mov rdi, stringformat
mov rsi, secInput
call printf
push qword 0
mov qword rax, 0
mov rdi, inputformat
mov rsi, rsp ;secondbase
call scanf
movsd xmm1, [rsp + 4]
pop rax
mov qword rax, 0
mov rdi, stringformat
mov rsi, hInput
call printf
push qword 0
mov qword rax, 0
mov rdi, inputformat
mov rsi, rsp ;height
call scanf
movsd xmm2, [rsp + 8]
pop rax
;---------------------------------Begin ComputeArea Calculation-----------------------------------
mov rax, 2
cvtsi2sd xmm3, rax
addsd xmm0, xmm1
divsd xmm0, xmm3
mulsd xmm0, xmm2
ret
;---------------------------------Output result to user-------------------------------------------
mov rax, 3
mov rdi, output
call printf
First off, why on earth are you saving ALL of those registers?!? The ABI for 64-bit Linux says you only need to save rbx, rbp, and r12 - r15, and only if you use those registers in your function. Also, since you are writing in assembly, there is no need to create a stack frame in 64-bit code (plus you aren't even using rbp, so why create one?). The only thing that is very important is to make sure your stack is aligned on a 16-byte boundary whenever you call another function. call pushes an 8-byte return address, so all you need in your ComputeArea function is sub rsp, 8 at the start and add rsp, 8 right before your ret.
In your first scanf you are using rsp without adjusting it, you just overwrote something!
You do some computations here:
mov rax, 2
cvtsi2sd xmm3, rax
addsd xmm0, xmm1
divsd xmm0, xmm3
mulsd xmm0, xmm2
ret
You return from the procedure here but do not pop all of those registers you just pushed!! So basically your stack pointer is all messed up! The CPU does not know what the return address is!
What you do in the prologue, must be reversed in the epilogue before you return!
Maybe, you should start simple, read in 3 floats and try to print them!
When I correct your code, this is my output:
Welcome to the area of trapezoids.
Please enter one of the base numbers: 5.8
Please enter the other base number: 2.2
Please enter the height: 6.5
The area of a trapezoid with sizes 5.799999999999999822, 2.200000000000000178, and 6.500000000000000000 is 26.000000000000000000 .
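As a reference point for checking the assembly, the formula from the comment in the question, ((a + b) / 2) * h, can be sketched in C++. The function name trapezoid_area is hypothetical; it only mirrors the addsd / divsd / mulsd sequence and does no input or output.

```cpp
#include <cassert>
#include <cmath>

// Area of a trapezoid with parallel sides a and b and height h.
// Same order of operations as the assembly: add, divide by 2, multiply.
double trapezoid_area(double a, double b, double h) {
    double sum = a + b;      // addsd xmm0, xmm1
    double mean = sum / 2.0; // divsd xmm0, xmm3 (xmm3 holds 2.0)
    return mean * h;         // mulsd xmm0, xmm2
}
```

With the inputs from the sample run (5.8, 2.2 and 6.5) this gives 26, which matches the last line of the corrected output.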
A simple program I am working on (for homework) requires that I take a keystroke as input and report the categories it falls under (is it a printable character, a decimal digit, etc.).
I'm using cmp to compare the keystroke against the maximum and/or minimum values of its category (for example, if the ASCII code of the keystroke is above 0x1F and below 0x7F, then it is a printable character).
However, something is obviously not working in my comparison, since no matter what I enter (for example, the Escape key), it never prints "Control Key".
Could it be that keys need some more processing before they can be compared based on ASCII value?
Here is my code
segment .data
controlKey: db "Control Key", 10
controlLen: equ $-controlKey
printableKey: db "Printable", 10
printableLen: equ $-printableKey
decimalKey: db "Decimal", 10
decimalLen: equ $-decimalKey
segment .bss
key resb 2
segment .text
global main
main:
mov eax, 3 ; system call 3 to get input
mov ebx, 0 ; standard input device
mov ecx, key ; pointer to id
mov edx, 2 ; take in this many bytes
int 0x80
control: ; check if it's a control key
mov ebx, 31 ; highest control key
mov edx, key
cmp edx, ebx
jg printable
mov eax, 4
mov ebx, 1
mov ecx, controlKey
mov edx, controlLen
int 0x80
; jmp exit ; It's obviously not any of the other categories
printable: ; Tell that it's a printable symbol
mov eax, 4
mov ebx, 1
mov ecx, printableKey
mov edx, printableLen
int 0x80
decimal:
mov ebx, 30h ; smallest decimal ASCII
mov edx, key
cmp edx, ebx
jl uppercase
mov ebx, 39h ; test against 9
cmp edx, ebx
jg uppercase
mov eax, 4
mov ebx, 1
mov ecx, decimalKey
mov edx, decimalLen
int 0x80
uppercase:
lowercase:
mov eax, 4 ; system call 4 for output
mov ebx, 1 ; standard output device
mov ecx, key ; move the content into ecx
mov edx, 1 ; tell edx how many bytes
int 0x80 ;
exit:
mov eax, 1
xor ebx, ebx
int 0x80
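For reference, the range checks the question describes can be sketched in C++ as below. The function names are hypothetical, and the ranges assume 7-bit ASCII. The checks are applied to the byte value that was read; in NASM syntax, mov edx, key loads the address of the buffer rather than its contents, so the comparisons in the listing above never see the character.

```cpp
#include <cassert>

// ASCII control characters: 0x00..0x1F plus DEL (0x7F).
bool is_control(unsigned char c) {
    return c <= 0x1F || c == 0x7F;
}

// Printable ASCII: space (0x20) through tilde (0x7E).
bool is_printable(unsigned char c) {
    return c >= 0x20 && c <= 0x7E;
}

// Decimal digits '0' (0x30) through '9' (0x39); these are also printable.
bool is_decimal(unsigned char c) {
    return c >= '0' && c <= '9';
}
```

A key can fall under more than one category (a digit is both printable and decimal), which is why these are separate predicates rather than one if/else chain.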
The Escape key won't be read by your application, since it is - most probably - caught by the terminal that your application runs in. I can see that you're using the read syscall in your code, which is, of course, fine, but you should remember that this function only provides reading from a file descriptor, which doesn't necessarily have to contain all the control signals sent from the keyboard. The file descriptor (stdin) doesn't even have to come from the keyboard, since a file might be redirected to your process as standard input.
I don't know if there's a good way of achieving what you're trying to do (capturing keystrokes, rather than the data they represent, which is what you're reading now) with just system calls in Linux. You could try a terminal-control library such as ncurses, or the termios interface, but I guess that isn't part of your assignment.
I did this a while back. Here is a sample that shows how to turn character echo and canonical mode on and off. When you press a key, its key code is displayed on the screen; the program exits once Shift+Q is pressed:
terminos.asm
ICANON equ 1<<1
ECHO equ 1<<3
sys_exit equ 1
sys_read equ 3
sys_write equ 4
stdin equ 0
stdout equ 1
global _start
SECTION .bss
lpBufIn resb 2
lpBufOut resb 2
termios resb 36
section .text
_start:
call echo_off
call canonical_off
.GetCode:
call GetKeyCode
movzx esi, byte[lpBufIn]
push esi
call PrintNum
pop esi
cmp esi, 81
jne .GetCode
call echo_on
call canonical_on
mov eax, sys_exit
xor ebx, ebx
int 80H
;~ #########################################
GetKeyCode:
mov eax, sys_read
mov ebx, stdin
mov ecx, lpBufIn
mov edx, 1
int 80h
ret
;~ #########################################
canonical_off:
call read_stdin_termios
; clear canonical bit in local mode flags
mov eax, ICANON
not eax
and [termios+12], eax
call write_stdin_termios
ret
;~ #########################################
echo_off:
call read_stdin_termios
; clear echo bit in local mode flags
mov eax, ECHO
not eax
and [termios+12], eax
call write_stdin_termios
ret
;~ #########################################
canonical_on:
call read_stdin_termios
; set canonical bit in local mode flags
or dword [termios+12], ICANON
call write_stdin_termios
ret
;~ #########################################
echo_on:
call read_stdin_termios
; set echo bit in local mode flags
or dword [termios+12], ECHO
call write_stdin_termios
ret
;~ #########################################
read_stdin_termios:
mov eax, 36h
mov ebx, stdin
mov ecx, 5401h
mov edx, termios
int 80h
ret
;~ #########################################
write_stdin_termios:
mov eax, 36h
mov ebx, stdin
mov ecx, 5402h
mov edx, termios
int 80h
ret
PrintNum:
push lpBufOut
push esi
call dwtoa
mov edi, lpBufOut
call GetStrlen
inc edx
mov ecx, lpBufOut
mov eax, sys_write
mov ebx, stdout
int 80H
ret
;~ #########################################
GetStrlen:
push ebx
xor ecx, ecx
not ecx
xor eax, eax
cld
repne scasb
mov byte [edi - 1], 10
not ecx
pop ebx
lea edx, [ecx - 1]
ret
;~ #########################################
dwtoa:
;~ number to convert = [ebp+8]
;~ pointer to buffer that receives number = [ebp+12]
push ebp
mov ebp, esp
push ebx
push esi
push edi
mov eax, [ebp + 8]
mov edi, [ebp + 12]
test eax, eax
jnz .sign
.zero:
mov word [edi], 30H
jmp .done
.sign:
jns .pos
mov byte [edi], "-"
neg eax
add edi, 1
.pos:
mov ecx, 3435973837
mov esi, edi
.doit:
mov ebx, eax
mul ecx
shr edx, 3
mov eax, edx
lea edx, [edx * 4 + edx]
add edx, edx
sub ebx, edx
add bl, "0"
mov [edi], bl
add edi, 1
cmp eax, 0
jg .doit
mov byte [edi], 0
.fixit:
sub edi, 1
mov al, [esi]
mov ah, [edi]
mov [edi], al
mov [esi], ah
add esi, 1
cmp esi, edi
jl .fixit
.done:
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
ret 4 * 2
makefile
APP = terminos
$(APP): $(APP).o
ld -o $(APP) $(APP).o
$(APP).o: $(APP).asm
nasm -f elf $(APP).asm
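The same toggling of the ICANON and ECHO bits can be sketched in C++ with the POSIX termios API, which wraps the ioctl calls used above. This is a minimal sketch, assuming a POSIX system; the function names raw_lflag, set_raw_input and restore_input are hypothetical, and the calls simply report failure when the descriptor is not a terminal.

```cpp
#include <cassert>
#include <termios.h>
#include <unistd.h>

// Clear the canonical-mode and echo bits in a local-mode flag word,
// like the "not eax / and [termios+12], eax" sequences in the listing.
tcflag_t raw_lflag(tcflag_t lflag) {
    return lflag & ~static_cast<tcflag_t>(ICANON | ECHO);
}

// Switch fd to unbuffered, non-echoing input. The previous settings are
// stored in `saved` so they can be restored later. Returns false on error
// (for example when fd is a file or pipe rather than a terminal).
bool set_raw_input(int fd, termios& saved) {
    if (tcgetattr(fd, &saved) != 0) {
        return false;
    }
    termios changed = saved;
    changed.c_lflag = raw_lflag(changed.c_lflag);
    return tcsetattr(fd, TCSANOW, &changed) == 0;
}

// Put back the settings captured by set_raw_input.
bool restore_input(int fd, const termios& saved) {
    return tcsetattr(fd, TCSANOW, &saved) == 0;
}
```

A caller would use set_raw_input(STDIN_FILENO, saved) before reading single bytes with read, and restore_input(STDIN_FILENO, saved) before exiting, so the shell is not left without echo.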