I am given two DWORD values and need to compare them using .IF, .ELSEIF, and .ELSE to see which number is larger or whether they are equal. Two prompts are written to the screen, "Enter Number 1" and "Enter Number 2". The two numbers are stored in separate registers, eax for num1 and ecx for num2. If eax and ecx are equal, the equal message is displayed.
If they aren't, this is where .ELSEIF comes in: eax is compared to ecx and vice versa.
The only problem is that the second number is always reported as larger than the first whenever they are not equal.
include asmlib.inc
.data
Prompt BYTE " Enter a number ", 0 ;Type number 1
Prompt2 BYTE " Enter another number ", 0 ;Type number 2
Large BYTE " Is larger ", 0 ;Larger number output
Equal BYTE " Is equal", 0 ;Numbers are equal output
num1 DWORD ? ;Number 1 is num1
num2 DWORD ? ;Number 2 is num2
.code
main PROC
mov edx, OFFSET Prompt ;Enter first number
call writeLine
call readInt
mov num1, eax
endl
mov edx, OFFSET Prompt2 ;Enter second num
call writeLine
call readInt
mov num2, ecx
endl
.IF eax == ecx
mov edx, OFFSET Equal ;display Equal output
call writeString ;display line
.ENDIF
.IF ecx > eax && eax < ecx
mov ecx, num2
call writeInt
mov edx, OFFSET Large
call writeString
.ELSEIF ecx < eax && eax > ecx
mov eax, num1
call writeInt
mov edx, OFFSET Large
call writeString
.ENDIF
exit
main ENDP
end main
mov edx, OFFSET Prompt2 ;Enter second num
call writeLine
call readInt
mov num2, ecx
Everything the .IF, .ELSEIF, .ELSE, or .ENDIF can do is invalidated because of an error retrieving the program's input.
readInt returns the integer in the EAX register, so your mov num2, ecx instruction should read mov num2, eax.
The only problem is that number two is always larger than the first value, if they are not equal.
If ecx > eax you display "Is larger", and if ecx < eax you also display "Is larger". What did you expect?
Prior to your first .IF eax == ecx instructions, you should load EAX from num1 and ECX from num2.
If the numbers happen to be equal, you don't want to use .ENDIF, because it makes you fall through and additionally (and needlessly) execute the comparison for "larger than".
mov eax, num1
mov ecx, num2
.IF eax == ecx
mov edx, OFFSET Equal
call writeString
.ELSEIF eax > ecx
call writeInt ; eax = num1
mov edx, OFFSET Large
call writeString
.ELSE
mov eax, ecx ; ecx = num2
call writeInt
mov edx, OFFSET Large
call writeString
.ENDIF
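For comparison, here is a minimal C++ sketch of the same three-way decision. The function name describe is made up for illustration, and the message texts are only modeled on the strings in the question. The point is that a single if / else if / else chain runs exactly one branch, so the "equal" case can never fall through into the "larger" case.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper mirroring an .IF / .ELSEIF / .ELSE chain:
// exactly one of the three branches runs for any pair of inputs.
std::string describe(std::int32_t num1, std::int32_t num2) {
    if (num1 == num2) {
        return "Is equal";
    } else if (num1 > num2) {
        return std::to_string(num1) + " Is larger";
    } else {
        return std::to_string(num2) + " Is larger";
    }
}
```

Note that the sketch compares signed integers. As far as I know, MASM's .IF treats plain register operands as unsigned unless told otherwise, so negative inputs may need extra care in the assembly version.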
Related
I am writing an assembly function callable from C++ that will read the CPU Vendor ID. Here is the function signature:
extern "C" void GetVendorID(const char* id);
Here is how I am calling it:
char vendorID[13];
GetVendorID(vendorID);
vendorID[12] = '\0';
Here are the important parts of the assembly:
global GetVendorID
GetVendorID:
push ebp
mov ebp, esp
push eax
push ebx
push ecx
push edx
mov eax, 0
cpuid ; <- this instruction moves the vendor id into ebx, edx, ecx
mov eax, [ebp + 8] ; <- move the value of the char pointer parameter into eax
; I have verified that this instruction works by returning eax and comparing it to the
; address of the vendorID array
; start with ebx
mov byte [eax], bl ; <- move a character into the char array
inc eax ; <- increment the pointer
shl ebx, 8 ; <- shift ebx to get the next character in its least significant bits
mov byte [eax], bl ; <- repeat
inc eax
shl ebx, 8
mov byte [eax], bl
inc eax
shl ebx, 8
mov byte [eax], bl
inc eax
shl ebx, 8
; above is repeated for edx and ecx
pop edx
pop ecx
pop ebx
pop eax
mov esp, ebp
pop ebp
ret
The way the string is stored in the registers is weird. The first character is stored in the least significant byte of ebx, the next is stored in the second least significant byte, and so on. That is why I am doing the left shifts.
I have verified that ebx, edx, ecx do contain the correct values by returning them from the function and printing them out. They contain "GenuineIntel". However, the char array remains unchanged. It is full of zeroes after the function returns.
I am not really sure why this isn't working. Am I accessing the parameter incorrectly?
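One thing that stands out in the listing is the shift direction: after the first store, a left shift by 8 moves zeros into the low byte, whereas a right shift brings the next character down. This may not be the only issue, and it does not by itself explain an array that stays entirely zero, so treat the following as a rough C++ sketch of the byte-unpacking step only. The function names are hypothetical, and the register values are the well-known ones for "GenuineIntel".

```cpp
#include <cstdint>
#include <string>

// Append the four bytes of a register value, least significant byte first.
void append_register_bytes(std::string& out, std::uint32_t reg) {
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<char>(reg & 0xFF)); // like: mov byte [eax], bl
        reg >>= 8;                                    // like: shr ebx, 8 (not shl)
    }
}

// Assemble the vendor string in the order EBX, EDX, ECX.
std::string vendor_from_registers(std::uint32_t ebx, std::uint32_t edx,
                                  std::uint32_t ecx) {
    std::string id;
    append_register_bytes(id, ebx);
    append_register_bytes(id, edx);
    append_register_bytes(id, ecx);
    return id;
}
```

With ebx = 0x756e6547, edx = 0x49656e69 and ecx = 0x6c65746e, vendor_from_registers yields "GenuineIntel".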
My assignment is to implement a function in assembly that does the following:
loop through a sequence of characters and swap them such that the end result is the original string in reverse ( 100 points )
Hint: collect the string from user as a C-string then pass it to the assembly function along with the number of characters entered by the user. To find out the number of characters use strlen() function.
I have written both the C++ and assembly programs, and they work up to a point: if I input 12345, the output is correctly shown as 54321, but with more than 5 characters the output becomes incorrect. For example, if I input 123456, the output is 653241. I would greatly appreciate it if anyone could point out where my mistake is:
.code
_reverse PROC
push ebp
mov ebp,esp ;stack pointer to ebp
mov ebx,[ebp+8] ; address of first array element
mov ecx,[ebp+12] ; the number of elements in array
mov eax,ebx
mov ebp,0 ;move 0 to base pointer
mov edx,0 ; set data register to 0
mov edi,0
Setup:
mov esi , ecx
shr ecx,1
add ecx,edx
dec esi
reverse:
cmp ebp , ecx
je allDone
mov edx, eax
add eax , edi
add edx , esi
Swap:
mov bl, [edx]
mov bh, [eax]
mov [edx],bh
mov [eax],bl
inc edi
dec esi
cmp edi, esi
je allDone
inc ebp
jmp reverse
allDone:
pop ebp ; pop ebp out of stack
ret ; return the value of eax
_reverse ENDP
END
and here is my c++ code:
#include<iostream>
#include <string>
using namespace std;
extern"C"
char reverse(char*, int);
int main()
{
char str[64] = {NULL};
int lenght;
cout << " Please Enter the text you want to reverse:";
cin >> str;
lenght = strlen(str);
reverse(str, lenght);
cout << " the reversed of the input is: " << str << endl;
}
You didn't comment your code, so IDK what exactly you're trying to do, but it looks like you are manually doing the array indexing with MOV / ADD instead of using an addressing mode like [eax + edi].
However, it looks like you're modifying your original value and then using it in a way that would make sense if it was unmodified.
mov edx, eax ; EAX holds a pointer to the start of array, read every iter
add eax , edi ; modify the start of the array!!!
add edx , esi
Swap:
inc edi
dec esi
EAX grows by EDI every step, and EDI increases linearly, so EAX increases quadratically: after n iterations it has advanced by 0 + 1 + ... + (n-1) = n(n-1)/2 bytes.
Single-stepping this in a debugger should have found this easily.
BTW, the normal way to do this is to walk one pointer up, one pointer down, and fall out of the loop when they cross. Then you don't need a separate counter, just cmp / ja. (Don't check for JNE or JE, because they can cross each other without ever being equal.)
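A minimal C++ sketch of that two-pointer approach, assuming the caller passes the string and its length as in the question (the name reverse_in_place is made up here):

```cpp
#include <cstddef>

// Walk one pointer up and one pointer down, swapping as we go, and stop
// as soon as the pointers meet or cross. No separate counter is needed.
void reverse_in_place(char* str, std::size_t length) {
    if (length < 2) {
        return;                       // nothing to swap
    }
    char* front = str;                // first character
    char* back = str + length - 1;    // last character
    while (front < back) {            // "cmp / jae" style exit, not "je"
        char tmp = *front;
        *front = *back;
        *back = tmp;
        ++front;
        --back;
    }
}
```

The loop condition is an ordered comparison, so it also terminates correctly for even lengths, where the two pointers step past each other without ever being equal.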
Overall you have the right idea: start at both ends of the string and swap elements until you get to the middle. The implementation is horrible, though.
mov ebp,0 ;move 0 to base pointer
This seems to be the loop counter (the comment is useless or even worse). I guess the idea was to swap length/2 elements, which is perfectly fine. HINT: I'd just compare pointers/indexes and exit once they collide.
mov edx,0 ; set data register to 0
...
add ecx,edx
mov edx, eax
Useless and misleading.
mov edi,0
mov esi , ecx
dec esi
These look like indexes to the start/end of the string. OK. HINT: I'd go with pointers to the start/end of the string, but indexes work too.
cmp ebp , ecx
je allDone
Exit after length/2 iterations. OK.
mov edx, eax
add eax , edi
add edx , esi
eax and edx point to the current symbols to be swapped. Almost OK, but this clobbers eax! Each loop iteration after the second will use wrong pointers! This is what caused your problem in the first place. This wouldn't have happened if you had used pointers instead of indexes, or if you had used offset addressing such as [eax+edi]/[eax+esi].
...
Swap part is OK
cmp edi, esi
je allDone
Second exit condition, this time comparing for index collision! Generally one exit condition should be enough; several exit conditions are usually either superfluous or hint at some flaw in the algorithm. Also, an equality comparison is not enough: the indexes can go from edi<esi to edi>esi during a single iteration.
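To see the clobbered base register in action, here is a rough C++ transcription of the question's loop. It is only a model: base stands in for EAX as an offset from the start of the string, lo for EDI, hi for ESI, and the function name is made up.

```cpp
#include <cstddef>
#include <utility>

// Mirrors the question's loop, including the bug: `base` is advanced by
// `lo` on every pass and never reset to the start of the string.
void reverse_with_clobbered_base(char* str, std::size_t length) {
    std::size_t base = 0;             // EAX, as an offset from the string start
    std::size_t lo = 0;               // EDI
    std::size_t hi = length - 1;      // ESI (unused when length is 0)
    for (std::size_t count = 0; count < length / 2; ++count) {
        std::size_t back = base + hi; // mov edx, eax / add edx, esi
        base += lo;                   // add eax, edi   <- the clobber
        std::swap(str[base], str[back]);
        ++lo;                         // inc edi
        --hi;                         // dec esi
        if (lo == hi) {               // cmp edi, esi / je allDone
            break;
        }
    }
}
```

Run on "123456", this model produces "653241", the same wrong output reported in the question, while "12345" still comes out as "54321" because the loop stops before the drifting base matters.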
I am working on a project where we need to pass an array of type char as a parameter and reverse the array. I feel like I am very close to getting it done, but I am stuck on the actual swapping process.
For my swapping function in my .asm, I used the same method I would in c++ (use an unused register as a temp, then swap the front and the back.) What I am not understanding is how would I go about changing the actual content at that address. I assumed performing the following would "change" the content at the destination address:
mov eax,[edx]
However, this did not work as planned. After I ran a for loop to iterate through the array again, everything stayed the same.
If anyone can point me in the right direction, it would be great. I have provided the code below with as many comments as I could.
Also, I am doing all this in a single .asm file; however, my professor wants me to have 3 separate .asm documents, one for each of the following functions: swap, reverse, and getLength. I tried to include the other 2 .asm documents in reverse.asm, but it kept giving me an error.
Assembly Code Starts:
.686
.model flat
.code
_reverse PROC
push ebp
mov ebp,esp ;Have ebp point to esp
mov ebx,[ebp+8] ;Point to beginning of array
mov eax,ebx
mov edx,1
mov ecx,0
mov edi,0
jmp getLength
getLength:
cmp ebp, 0 ;Counter to iterate until needed to stop
je setup
add ecx,1
mov ebp,[ebx+edx]
add edx,1
jmp getLength
setup: ;This is to set up the numbers correctly and get array length divided by 2
mov esi,ecx
mov edx,0
mov eax,ecx
mov ecx,2
div ecx
mov ecx,eax
add ecx,edx ;Set up ecx(Length of string) correctly by adding modulo if odd length string
mov eax,ebx
dec esi
jmp reverse
reverse: ;I started the reverse function by using a counter to iterate through length / 2
cmp edi, ecx
je allDone
mov ebx,eax ;Set ebx to the beginning of array
mov edx,eax ;Set edx to the beginning of array
add ebx,edi ;Move ebx to correct index to perform swap
add edx,esi ;Move edx to the back at the correct index
jmp swap ;Invoke swap function
swap:
mov ebp,ebx ;Move value to temp
mov ebx,[edx] ;Swap the back end value to the front
mov edx,[edx] ;Move temp to back
inc edi ;Increment to move up one index to set up next swap
dec esi ;Decrement to move back one index to set up for next swap
jmp reverse ;Jump back to reverse to setup next index swapping
allDone:
pop ebp
ret
_reverse ENDP
END
C++ Code starts:
#include <iostream>
#include <string>
using namespace std;
extern "C" char reverse(char*);
int main()
{
const int SIZE = 20;
char str1[SIZE] = { NULL };
cout << "Please enter a string: ";
cin >> str1;
cout << "Your string is: ";
for (int i = 0; str1[i] != NULL; i++)
{
cout << str1[i];
}
cout << "." << endl;
reverse(str1);
cout << "Your string in reverse is: ";
for (int i = 0; str1[i] != NULL; i++)
{
cout << str1[i];
}
cout << "." << endl;
system("PAUSE");
return 0;
}
So after many more hours of tinkering and looking around, I was finally able to figure out how to properly copy over a byte. I will post my .asm code below with comments if anybody needs it for future reference.
I was actually moving the content of the current address into a 32-bit register. After I changed it from mov ebx,[eax] to mov bl,[eax], it copied the value correctly.
I will only post the code that I was having difficulty with so I do not give away the entire project for other students.
ASM Code Below:
swap:
mov bl,[edx] ;Uses bl since we are trying to copy a 1 byte char value
mov bh,[eax] ;Uses bh since we are trying to copy a 1 byte char value
mov [edx],bh ;Passing the value to the end of the array
mov [eax],bl ;Passing the value to the beginning of the array
inc eax ;Moving the array one index forward
dec edx ;Moving the array one index backwards
dec ecx ;Decreasing the counter by one to continue loop as needed
jmp reverse ;Jump back to reverse to check if additional swap is needed
Thanks for everyone that helped.
mov eax,[edx] (assuming Intel syntax) places the 32 bits found in memory at address edx into eax. That is, this code retrieves data from a memory location. If you'd like to write to a memory location, you need to reverse the operands: mov [edx], eax.
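The load/store direction and the operand width can both be sketched in C++. The helper names below are hypothetical, and the dword version exists only to show why a 32-bit access is the wrong width for swapping single characters.

```cpp
#include <cstdint>
#include <cstring>

// Byte-sized swap, like:
//   mov bl,[edx] / mov bh,[eax] / mov [edx],bh / mov [eax],bl
void swap_bytes(char* a, char* b) {
    char from_a = *a;   // load: memory -> register
    char from_b = *b;   // load: memory -> register
    *a = from_b;        // store: register -> memory
    *b = from_a;        // store: register -> memory
}

// Dword-sized copy, like mov ebx,[src] followed by mov [dst],ebx.
// It rewrites four chars at dst, not one.
void store_dword(char* dst, const char* src) {
    std::uint32_t reg;
    std::memcpy(&reg, src, sizeof reg);   // 32-bit load
    std::memcpy(dst, &reg, sizeof reg);   // 32-bit store
}
```

Starting from "ABCDEFGH", swap_bytes on the first and last characters gives "HBCDEFGA", while store_dword from offset 4 to offset 0 gives "EFGHEFGH".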
After playing with some 16-bit code overnight for sorting, I have the following two functions that may be of use. Obviously, you can't copy/paste them; you'll have to study them. However, you'll notice that the code is able to swap items of arbitrary size, which is perfect for swapping elements that are structures of some type.
; copies cx bytes from ds:si to es:di
copyBytes:
shr cx, 1
jnc .swapCopy1Loop
movsb
.swapCopy1Loop:
shr cx, 1
jnc .swapCopy2Loop
movsw
.swapCopy2Loop:
rep movsd
ret
; bp+0 bp+2 bp+4
;void swap(void *ptr1, void *ptr2, int dataSizeBytes)
swapElems:
push bp
mov bp, sp
add bp, 4
push di
push si
push es
mov ax, ds
mov es, ax
sub sp, [bp+4] ; allocate dataSizeBytes on the stack, starting at bp-6 - dataSizeBytes
mov di, sp
mov si, [bp+0]
mov cx, [bp+4]
call copyBytes
mov si, [bp+2]
mov di, [bp+0]
mov cx, [bp+4]
call copyBytes
mov si, sp
mov di, [bp+2]
mov cx, [bp+4]
call copyBytes
add sp, [bp+4]
pop es
pop si
pop di
pop bp
ret 2 * 3
I have two functions that take integers x and y read from input.
product returns x * y
power returns x ^ y, however it uses recursion and product to compute this. so x would be "base" and y is "exponent".
They are called from C++:
int a, b, x, y;
a = product(x, y);
b = power(x, y);
and here is the asm. I got the product to work, however am having trouble with power because I am not sure of the syntax/method/convention to call product from it (and call itself for the recursion). EDIT: Recursion must be used.
global product
global power
section .text
product:
push ebp
mov ebp, esp
sub esp, 4
push edi
push esi
xor eax, eax
mov edi, [ebp+8]
mov esi, [ebp+12]
mov [ebp-4], edi
product_loop:
add [ebp-4], edi
mov eax, [ebp-4]
sub esi, 1
cmp esi, 1
jne product_loop
product_done:
pop esi
pop edi
mov esp, ebp
pop ebp
ret
power:
push ebp
mov ebp, esp
sub esp, 4
push edi
push esi
push ebx
xor eax, eax
mov edi, [ebp+8]
mov esi, [ebp+12]
;;;
check:
cmp esi, 1 ; if exp < 1
jl power_stop
recursion: ; else (PLEASE HELP!!!!!!!!)
; eax = call product (base, (power(base, exp-1))
power_stop:
mov eax, 1 ; return 1
power_done:
push ebx
pop esi
pop edi
mov esp, ebp
pop ebp
ret
EDIT: My solution!
power:
; Standard prologue
push ebp ; Save the old base pointer
mov ebp, esp ; Set new value of the base pointer
sub esp, 4 ; make room for 1 local variable result
push ebx ; this is exp-1
xor eax, eax ; Place zero in EAX. We will keep a running sum
mov eax, [ebp+12] ; exp
mov ebx, [ebp+8] ; base
cmp eax, 1 ; exp >= 1?
jge L1 ; if so, go do a recursive call
mov eax, 1 ; otherwise return 1
jmp L2
L1:
dec eax ; exp-1
push eax ; push argument 2: exp-1
push ebx ; push argument 1: base
call power ; do the call, result goes in eax: power(base, exp-1)
add esp, 8 ; get rid of arguments
push eax ; push argument 2: power(base, exponent-1)
push ebx ; push argument 1: base
call product ; product(base, power(base, exponent-1))
add esp, 8 ; get rid of arguments so the pop below restores the saved ebx
L2:
; Standard epilogue
pop ebx ; restore register
mov esp, ebp ; deallocate local variables
pop ebp ; Restore the callers base pointer.
ret ; Return to the caller.
You are using the CDECL calling convention, so you first have to push the arguments onto the stack in reverse order, then call the function, and then clean up the stack after it returns.
push arg_last
push arg_first
call MyFunction
add esp, 8 ; the argument_count*argument_size
But here are some notes on your code:
Your function product does not return any value. Use mov eax, [ebp-4] immediately after product_done label.
Multiplication is much easier with the mul or imul instruction. Repeated addition is the slowest possible way.
Computing the power by recursion is not the best idea. Use the following algorithm:
1. Y = 1
2. If N = 0, exit.
3. If N is odd: Y = Y*x; N = N-1
4. If N is even: x = x*x; N = N/2
5. Go to step 2.
Use the SHR instruction to divide N by 2. Use the test instruction to check whether the number is odd or even.
This way, you simply don't need to call product from the power function.
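Since the question states that recursion is required, the loop is only an aside, but here is a short C++ sketch of square-and-multiply for reference. The function name is made up, and the even step squares the base x rather than the accumulator y. Overflow for large results is not handled.

```cpp
#include <cstdint>

// Square-and-multiply: computes x^n with O(log n) multiplications.
std::int64_t power_by_squaring(std::int64_t x, unsigned n) {
    std::int64_t y = 1;
    while (n != 0) {
        if (n & 1u) {      // odd exponent (the asm equivalent: test n, 1)
            y *= x;
            --n;
        } else {           // even exponent
            x *= x;        // square the base
            n >>= 1;       // halve the exponent (the asm equivalent: shr n, 1)
        }
    }
    return y;
}
```

For example, power_by_squaring(2, 10) goes through x = 4, 16, 256 and returns 1024 after five loop passes instead of ten multiplications.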
If you're not sure how to write the assembly, you can generally write it in C++ and compile it to assembly for clues - something like:
int power(int n, int exp)
{
return exp == 0 ? 1 :
exp == 1 ? n :
product(n, power(n, exp - 1));
}
Then you should just be able to use gcc -S or whatever your compiler's equivalent switch for assembly output is, or disassemble the machine code if you prefer.
For example, the function above, thrown in with int product(int x, int y) { return x * y; } and int main() { return power(3, 4); }, compiled with Microsoft's compiler à la cl /Fa power.cc:
; Listing generated by Microsoft (R) Optimizing Compiler Version 15.00.30729.01
TITLE C:\home\anthony\user\dev\power.cc
.686P
.XMM
include listing.inc
.model flat
INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES
PUBLIC ?product@@YAHHH@Z ; product
; Function compile flags: /Odtp
_TEXT SEGMENT
_x$ = 8 ; size = 4
_y$ = 12 ; size = 4
?product@@YAHHH@Z PROC ; product
; File c:\home\anthony\user\dev\power.cc
; Line 1
push ebp
mov ebp, esp
mov eax, DWORD PTR _x$[ebp]
imul eax, DWORD PTR _y$[ebp]
pop ebp
ret 0
?product@@YAHHH@Z ENDP ; product
_TEXT ENDS
PUBLIC ?power@@YAHHH@Z ; power
; Function compile flags: /Odtp
_TEXT SEGMENT
tv73 = -8 ; size = 4
tv74 = -4 ; size = 4
_n$ = 8 ; size = 4
_exp$ = 12 ; size = 4
?power@@YAHHH@Z PROC ; power
; Line 4
push ebp
mov ebp, esp
sub esp, 8
; Line 7
cmp DWORD PTR _exp$[ebp], 0
jne SHORT $LN5@power
mov DWORD PTR tv74[ebp], 1
jmp SHORT $LN6@power
$LN5@power:
cmp DWORD PTR _exp$[ebp], 1
jne SHORT $LN3@power
mov eax, DWORD PTR _n$[ebp]
mov DWORD PTR tv73[ebp], eax
jmp SHORT $LN4@power
$LN3@power:
mov ecx, DWORD PTR _exp$[ebp]
sub ecx, 1
push ecx
mov edx, DWORD PTR _n$[ebp]
push edx
call ?power@@YAHHH@Z ; power
add esp, 8
push eax
mov eax, DWORD PTR _n$[ebp]
push eax
call ?product@@YAHHH@Z ; product
add esp, 8
mov DWORD PTR tv73[ebp], eax
$LN4@power:
mov ecx, DWORD PTR tv73[ebp]
mov DWORD PTR tv74[ebp], ecx
$LN6@power:
mov eax, DWORD PTR tv74[ebp]
; Line 8
mov esp, ebp
pop ebp
ret 0
?power@@YAHHH@Z ENDP ; power
_TEXT ENDS
PUBLIC _main
; Function compile flags: /Odtp
_TEXT SEGMENT
_main PROC
; Line 11
push ebp
mov ebp, esp
; Line 12
push 4
push 3
call ?power@@YAHHH@Z ; power
add esp, 8
; Line 13
pop ebp
ret 0
_main ENDP
_TEXT ENDS
END
To walk you through this:
?power@@YAHHH@Z PROC ; power
; Line 4
push ebp
mov ebp, esp
sub esp, 8
The above is the entry code for the power function: it sets up the stack frame and reserves 8 bytes of stack for the two temporaries tv73 and tv74. The function arguments sit above the saved ebp and are accessed below as _exp$[ebp] (that's exp) and _n$[ebp] (i.e. n).
; Line 7
cmp DWORD PTR _exp$[ebp], 0
jne SHORT $LN5@power
mov DWORD PTR tv74[ebp], 1
jmp SHORT $LN6@power
Basically, if exp is not equal to 0 we'll continue at label $LN5@power below, but if it is 0 then load 1 into the return value location on the stack at tv74[ebp] and jump to the function return instructions at $LN6@power.
$LN5@power:
cmp DWORD PTR _exp$[ebp], 1
jne SHORT $LN3@power
mov eax, DWORD PTR _n$[ebp]
mov DWORD PTR tv73[ebp], eax
jmp SHORT $LN4@power
Similar to the above - if exp is 1 then put n into eax and from there into the return value stack memory, then jump to the return instructions.
Now it starts to get interesting...
$LN3@power:
mov ecx, DWORD PTR _exp$[ebp]
sub ecx, 1
push ecx
Subtract 1 from exp and push it onto the stack...
mov edx, DWORD PTR _n$[ebp]
push edx
Also push n onto the stack...
call ?power@@YAHHH@Z ; power
Recursively call the power function, which will use the two values pushed above.
add esp, 8
A stack adjustment after the function above returns.
push eax
Put the result of the recursive call - which the power return instructions leave in the eax register - onto the stack...
mov eax, DWORD PTR _n$[ebp]
push eax
Also push n onto the stack...
call ?product@@YAHHH@Z ; product
Call the product function to multiply the value returned by the call to power above by n.
add esp, 8
mov DWORD PTR tv73[ebp], eax
Copy the result of product into a temporary address on the stack....
$LN4@power:
mov ecx, DWORD PTR tv73[ebp]
mov DWORD PTR tv74[ebp], ecx
Pick up the value from that tv73 temporary location and copy it into tv74...
$LN6@power:
mov eax, DWORD PTR tv74[ebp]
Finally, move the result from tv74 into the eax register, which is where the caller expects to find the return value of power.
; Line 8
mov esp, ebp
pop ebp
ret 0
Clean up the stack and return.
I am learning assembly and have started experimenting with SSE and MMX registers in the Digital Mars C++ compiler (Intel syntax is more easily readable). I have finished a program that takes var_1 as a value and converts it to the var_2 number system (this is 8-bit for now; I will expand it to 32, 64, and 128 bits later). The program does this in two ways:
__asm inlining
Usual C++ way of %(modulo) operator.
Question: Can you tell me a more efficient way to use the xmm0-7 and mm0-7 registers, and how to exchange individual bytes of them with the 8-bit registers such as al and ah?
The usual C++ approach with the % (modulo) operator is very slow compared with __asm on my computer (Pentium M Centrino, 2.0 GHz).
If you can tell me how to get rid of the division instruction in __asm, it will be even faster.
When I run the program it gives me:
(for the values: var_1=17,var_2=2,all loops are 200M times)
17 is 10001 in number system 2
__asm(clock)...........: 7250 <------too bad. it is 8-bit calc.
C++(clock).............: 12250 <------not very slow(var_2 is a power of 2)
(for the values: var_1=33,var_2=7,all loops are 200M times)
33 is 45 in number system 7
__asm(clock)..........: 2875 <-------not good. it is 8-bit calc.
C++(clock)............: 6328 <----------------really slow(var_2 is not a power of 2)
The second C++ code (the one with the % operator):
t1=clock();//reference time
for(int i=0;i<200000000;i++)
{
y=x;
counter=0;
while(y>g)
{
var_3[counter]=y%g;
y/=g;
counter++;
}
var_3[counter]=y%g;
}
t2=clock();//final time
__asm code:
__asm // i love assembly in some parts of C++
{
pushf //here does register backup
push eax
push ebx
push ecx
push edx
push edi
mov eax,0h //this will be outer loop counter init to zero
//init of medium-big registers to zero
movd xmm0,eax //cannot set to immediate constant: xmm0=outer loop counter
shufps xmm0,xmm0,0h //this makes all bits zero
movd xmm1,eax
movd xmm2,eax
shufps xmm1,xmm1,0h
shufps xmm2,xmm2,0h
movd xmm2,eax
shufps xmm3,xmm3,0h//could have made pxor xmm3,xmm3(single instruction)
//init complete(xmm0,xmm1,xmm2,xmm3 are zero)
movd xmm1,[var_1] //storing variable_1 to register
movd xmm2,[var_2] //storing var_2 to register
lea ebx,var_3 //calculate var_3 address
movd xmm3,ebx //storing var_3's address to register
for_loop:
mov eax,0h
//this line is index-init to zero(digit array index)
movd edx,xmm2
mov cl,dl //this is the var_1 stored in cl
movd edx,xmm1
mov al,dl //this is the var_2 stored in al
mov edx,0h
dng:
mov ah,00h //preparation for a 8-bit division
div cl //divide
movd ebx,xmm3 //get var_3 address
add ebx,edx //i couldnt find a way to multiply with 4
add ebx,edx //so i added 4 times ^^
add ebx,edx //add
add ebx,edx //last adding
//below, mov [ebx],ah is the only memory accessing instruction
mov [ebx],ah //(8 bit)this line is equivalent to var_3[i]=remainder
inc edx //i++;
cmp al,00h //is division zero?
jne dng //if no, loop again
//here edi register has the number of digits
movd eax,xmm0 //get the outer loop counter from medium-big register
add eax,01h //j++;
movd xmm0,eax //store the new counter to medium-big register
cmp eax,0BEBC200h //is j<(200,000,000) ?
jb for_loop //if yes, go loop again
mov [var_3_size],edx //now we have number of digits too!
//here does registers revert back to old values
pop edi
pop edx
pop ecx
pop ebx
pop eax
popf
}
Whole code:
#include <iostream.h>
#include <cmath>
#include<stdlib.h>
#include<stdio.h>
#include<time.h>
int main()
{
srand(time(0));
clock_t t1=clock();
clock_t t2=clock();
int var_1=17; //number itself
int var_2=2; //number system
int var_3[100]; //digits to be showed(maximum 100 as seen )
int var_3_size=0;//asm block will decide what will the number of digits be
for(int i=0;i<100;i++)
{
var_3[i]=0; //here we initialize digits to zeroes
}
t1=clock();//reference time to take
__asm // i love assembly in some parts of C++
{
pushf //here does register backup
push eax
push ebx
push ecx
push edx
push edi
mov eax,0h //this will be outer loop counter init to zero
//init of medium-big registers to zero
movd xmm0,eax //cannot set to immediate constant: xmm0=outer loop counter
shufps xmm0,xmm0,0h //this makes all bits zero
movd xmm1,eax
movd xmm2,eax
shufps xmm1,xmm1,0h
shufps xmm2,xmm2,0h
movd xmm2,eax
shufps xmm3,xmm3,0h
//init complete(xmm0,xmm1,xmm2,xmm3 are zero)
movd xmm1,[var_1] //storing variable_1 to register
movd xmm2,[var_2] //storing var_2 to register
lea ebx,var_3 //calculate var_3 address
movd xmm3,ebx //storing var_3's address to register
for_loop:
mov eax,0h
//this line is index-init to zero(digit array index)
movd edx,xmm2
mov cl,dl //this is the var_1 stored in cl
movd edx,xmm1
mov al,dl //this is the var_2 stored in al
mov edx,0h
dng:
mov ah,00h //preparation for a 8-bit division
div cl //divide
movd ebx,xmm3 //get var_3 address
add ebx,edx //i couldnt find a way to multiply with 4
add ebx,edx //so i added 4 times ^^
add ebx,edx //add
add ebx,edx //last adding
//below, mov [ebx],ah is the only memory accessing instruction
mov [ebx],ah //(8 bit)this line is equivalent to var_3[i]=remainder
inc edx //i++;
cmp al,00h //is division zero?
jne dng //if no, loop again
//here edi register has the number of digits
movd eax,xmm0 //get the outer loop counter from medium-big register
add eax,01h //j++;
movd xmm0,eax //store the new counter to medium-big register
cmp eax,0BEBC200h //is j<(200,000,000) ?
jb for_loop //if yes, go loop again
mov [var_3_size],edx //now we have number of digits too!
//here does registers revert back to old values
pop edi
pop edx
pop ecx
pop ebx
pop eax
popf
}
t2=clock(); //finish time
printf("\n assembly_inline(clocks): %i for the 200 million calculations",(t2-t1));
printf("\n value %i(in decimal) is: ",var_1);
for(int i=var_3_size-1;i>=0;i--)
{
printf("%i",var_3[i]);
}
printf(" in the number system: %i \n",var_2);
//and: more readable form(end easier)
int counter=var_3_size;
int x=var_1;
int g=var_2;
int y=x;// backup
t1=clock();//reference time
for(int i=0;i<200000000;i++)
{
y=x;
counter=0;
while(y>g)
{
var_3[counter]=y%g;
y/=g;
counter++;
}
var_3[counter]=y%g;
}
t2=clock();//final time
printf("\n C++(clocks): %i for the 200 million calculations",(t2-t1));
printf("\n value %i(in decimal) is: ",x);
for(int i=var_3_size-1;i>=0;i--)
{
printf("%i",var_3[i]);
}
printf(" in the number system: %i \n",g);
return 0;
}
edit:
this is 32-bit version
void get_digits_asm()
{
__asm
{
pushf //couldnt store this in other registers
movd xmm0,eax//storing in xmm registers instead of pushing
movd xmm1,ebx//
movd xmm2,ecx//
movd xmm3,edx//
movd xmm4,edi//end of push backups
mov eax,[variable_x]
mov ebx,[number_system]
mov ecx,0h
mov edi,0h
begin_loop:
mov edx,0h
div ebx
lea edi,digits
mov [edi+ecx*4],edx
add ecx,01h
cmp eax,ebx
ja begin_loop
mov edx,0
div ebx
lea edi,digits
mov [edi+ecx*4],edx
inc ecx
mov [digits_total],ecx
movd edi,xmm4//pop edi
movd edx,xmm3//pop edx
movd ecx,xmm2//pop ecx
movd ebx,xmm1//pop ebx
movd eax,xmm0//pop eax
popf
}
}
The code can be much simpler of course: (modeled after the C++ version, does not include pushes and pops, and not tested)
mov esi,200000000
_bigloop:
mov eax,[y]
mov ebx,[g]
lea edi,var_3
; eax = y
; ebx = g
; edi = var_3
xor ecx,ecx
; ecx = counter
_loop:
xor edx,edx
div ebx
mov [edi+ecx*4],edx
add ecx,1
test eax,eax
jnz _loop
sub esi,1
jnz _bigloop
But I would be surprised if it was faster than the C++ version, and in fact it'll almost certainly be slower if the base is a power of two - all sane compilers know how to turn a division and/or modulo by a power of two into bitshifts and bitwise ands.
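For reference, the same digit loop written as plain C++ might look like the sketch below. The name to_digits is made up, and digits come out least significant first, matching how the assembly fills var_3. Testing the quotient at the bottom of the loop means a value of 0 still yields one digit.

```cpp
#include <cassert>
#include <vector>

// Convert `value` to the given base, least significant digit first.
std::vector<unsigned> to_digits(unsigned value, unsigned base) {
    assert(base >= 2);                    // smaller bases would never terminate
    std::vector<unsigned> digits;
    do {
        digits.push_back(value % base);   // remainder (edx after div ebx)
        value /= base;                    // quotient  (eax after div ebx)
    } while (value != 0);                 // test eax,eax / jnz _loop
    return digits;
}
```

With this, to_digits(17, 2) gives 1, 0, 0, 0, 1 and to_digits(33, 7) gives 5, 4, which read back to front are the "10001" and "45" shown in the question's output.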
Here's a version that uses an 8-bit division. Similar caveats apply, but now the division could even overflow (if y / g is more than 255).
mov esi,200000000
_bigloop:
mov eax,[y]
mov ebx,[g]
lea edi,var_3
; eax = y
; ebx = g
; edi = var_3
xor ecx,ecx
; ecx = counter
_loop:
div bl
mov [edi+ecx],ah
add ecx,1
and eax,0xFF
jnz _loop
sub esi,1
jnz _bigloop