I am still struggling with g++ inline assembler and trying to understand how to use it.
I've adapted a piece of code from here: http://asm.sourceforge.net/articles/linasm.html (Quoted from the "Assembler Instructions with C Expressions Operands" section in gcc info files)
static inline uint32_t sum0() {
uint32_t foo = 1, bar=2;
uint32_t ret;
__asm__ __volatile__ (
"add %%ebx,%%eax"
: "=eax"(ret) // output
: "eax"(foo), "ebx"(bar) // input
: "eax" // modify
);
return ret;
}
I compiled with optimisations disabled:
g++ -Og -O0 inline1.cpp -o test
The disassembled code puzzles me:
(gdb) disassemble sum0
Dump of assembler code for function sum0():
0x00000000000009de <+0>: push %rbp ;prologue...
0x00000000000009df <+1>: mov %rsp,%rbp ;prologue...
0x00000000000009e2 <+4>: movl $0x1,-0xc(%rbp) ;initialize foo
0x00000000000009e9 <+11>: movl $0x2,-0x8(%rbp) ;initialize bar
0x00000000000009f0 <+18>: mov -0xc(%rbp),%edx ;
0x00000000000009f3 <+21>: mov -0x8(%rbp),%ecx ;
0x00000000000009f6 <+24>: mov %edx,-0x14(%rbp) ; This is unexpected
0x00000000000009f9 <+27>: movd -0x14(%rbp),%xmm1 ; why moving variables
0x00000000000009fe <+32>: mov %ecx,-0x14(%rbp) ; to extended registers?
0x0000000000000a01 <+35>: movd -0x14(%rbp),%xmm2 ;
0x0000000000000a06 <+40>: add %ebx,%eax ; add (as expected)
0x0000000000000a08 <+42>: movd %xmm0,%edx ; copying the wrong result to ret
0x0000000000000a0c <+46>: mov %edx,-0x4(%rbp) ; " " " " " "
0x0000000000000a0f <+49>: mov -0x4(%rbp),%eax ; " " " " " "
0x0000000000000a12 <+52>: pop %rbp ;
0x0000000000000a13 <+53>: retq
End of assembler dump.
As expected, the sum0() function returns the wrong value.
Any thoughts? What is going on? How to get it right?
-- EDIT --
Based on @MarcGlisse's comment, I tried:
static inline uint32_t sum0() {
uint32_t foo = 1, bar=2;
uint32_t ret;
__asm__ __volatile__ (
"add %%ebx,%%eax"
: "=a"(ret) // output
: "a"(foo), "b"(bar) // input
: "eax" // modify
);
return ret;
}
It seems that the tutorial I've been following is misleading. "eax" in the output/input field does not name the register itself; it is read as the three single-letter constraints e, a, and x from the constraint table.
Anyway, I still can't get it right. The code above fails to compile with the error: 'asm' operand has impossible constraints.
I don't see why.
The Extended inline assembly constraints for x86 are listed in the official documentation.
The complete documentation is also worth reading.
As you can see, the constraints are all single letters.
The constraint "eax" for foo specifies three constraints:
a
The a register.
x
Any SSE register.
e
32-bit signed integer constant, or ...
Since you are telling GCC that eax is clobbered, it cannot put the input operand there, so it picks xmm0.
When the compiler selects the registers to use to represent the input operands, it does not use any of the clobbered registers.
The proper constraint is simply "a".
You need to remove eax (by the way it should be rax due to zeroing of the upper bits) from the clobbers (and add "cc").
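Putting those points together, a corrected version of the function might look like the sketch below. It assumes an x86 target and GCC-style extended asm; apart from the constraints and clobbers, it is the code from the question.

```cpp
#include <cstdint>

// Sketch: same function, single-letter constraints, eax no longer clobbered.
static inline uint32_t sum0() {
    uint32_t foo = 1, bar = 2;
    uint32_t ret;
    __asm__ __volatile__(
        "add %%ebx, %%eax"      // eax += ebx
        : "=a"(ret)             // output: ret is read back from eax
        : "a"(foo), "b"(bar)    // inputs: foo in eax, bar in ebx
        : "cc"                  // add modifies the flags
    );
    return ret;
}
```

Calling sum0() should now yield 3, because the compiler itself loads eax and ebx before the add.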
I'm trying to write a "hello world" program to test inline assembler in g++.
(still learning AT&T syntax)
The code is:
#include <stdlib.h>
#include <stdio.h>
# include <iostream>
using namespace std;
int main() {
int c,d;
__asm__ __volatile__ (
"mov %eax,1; \n\t"
"cpuid; \n\t"
"mov %edx, $d; \n\t"
"mov %ecx, $c; \n\t"
);
cout << c << " " << d << "\n";
return 0;
}
I'm getting the following error:
inline1.cpp: Assembler messages:
inline1.cpp:18: Error: unsupported instruction `mov'
inline1.cpp:19: Error: unsupported instruction `mov'
Can you help me to get it done?
Thanks
Your assembly code is not valid. Please read up on Extended Asm carefully. Here's another good overview.
Here is some example CPUID code from here:
static inline void cpuid(int code, uint32_t* a, uint32_t* d)
{
asm volatile ( "cpuid" : "=a"(*a), "=d"(*d) : "0"(code) : "ebx", "ecx" );
}
Note the format:
the first : is followed by the output operands: "=a"(*a), "=d"(*d); "=a" is eax and "=d" is edx
the second : is followed by the input operands: "0"(code); "0" means that code should occupy the same location as output operand 0 (eax in this case)
the third : is followed by the list of clobbered registers: "ebx", "ecx"
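As a variation on that format, the sketch below declares all four registers as outputs instead of listing two of them as clobbers, so the caller can pick whichever values it needs. The struct and function names are made up for illustration, and it assumes GCC or Clang targeting x86-64.

```cpp
#include <cstdint>

// Hypothetical container for the four registers that cpuid writes.
struct CpuidRegs { uint32_t eax, ebx, ecx, edx; };

static inline CpuidRegs cpuid_all(uint32_t leaf, uint32_t subleaf = 0) {
    CpuidRegs r;
    asm volatile("cpuid"
                 : "=a"(r.eax), "=b"(r.ebx), "=c"(r.ecx), "=d"(r.edx)
                 : "0"(leaf),      // same location as output 0 (eax)
                   "2"(subleaf));  // same location as output 2 (ecx)
    return r;
}
```

With leaf 1, the feature bits come back in the ecx and edx members.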
I kept @AMA's answer as the accepted one because it was complete enough. But after some more thought I concluded that it is not 100% correct.
The code I was trying to implement in GCC is the one below (Microsoft Visual Studio version).
int c,d;
_asm
{
mov eax, 1;
cpuid;
mov d, edx;
mov c, ecx;
}
When cpuid executes with eax set to 1, feature information is returned in ecx and edx.
The suggested code returns the values from eax ("=a") and edx ("=d").
This can be easily seen at gdb:
(gdb) disassemble cpuid
Dump of assembler code for function cpuid(int, uint32_t*, uint32_t*):
0x0000000000000a2a <+0>: push %rbp
0x0000000000000a2b <+1>: mov %rsp,%rbp
0x0000000000000a2e <+4>: push %rbx
0x0000000000000a2f <+5>: mov %edi,-0xc(%rbp)
0x0000000000000a32 <+8>: mov %rsi,-0x18(%rbp)
0x0000000000000a36 <+12>: mov %rdx,-0x20(%rbp)
0x0000000000000a3a <+16>: mov -0xc(%rbp),%eax
0x0000000000000a3d <+19>: cpuid
0x0000000000000a3f <+21>: mov -0x18(%rbp),%rcx
0x0000000000000a43 <+25>: mov %eax,(%rcx) <== HERE
0x0000000000000a45 <+27>: mov -0x20(%rbp),%rax
0x0000000000000a49 <+31>: mov %edx,(%rax) <== HERE
0x0000000000000a4b <+33>: nop
0x0000000000000a4c <+34>: pop %rbx
0x0000000000000a4d <+35>: pop %rbp
0x0000000000000a4e <+36>: retq
End of assembler dump.
The code that generates something closer to what I want is (edited based on feedback in the comments):
static inline void cpuid2(uint32_t* d, uint32_t* c)
{
int a = 1;
asm volatile ( "cpuid" : "=d"(*d), "=c"(*c), "+a"(a) :: "ebx" );
}
The result is:
(gdb) disassemble cpuid2
Dump of assembler code for function cpuid2(uint32_t*, uint32_t*):
0x00000000000009b0 <+0>: push %rbp
0x00000000000009b1 <+1>: mov %rsp,%rbp
0x00000000000009b4 <+4>: push %rbx
0x00000000000009b5 <+5>: mov %rdi,-0x20(%rbp)
0x00000000000009b9 <+9>: mov %rsi,-0x28(%rbp)
0x00000000000009bd <+13>: movl $0x1,-0xc(%rbp)
0x00000000000009c4 <+20>: mov -0xc(%rbp),%eax
0x00000000000009c7 <+23>: cpuid
0x00000000000009c9 <+25>: mov %edx,%esi
0x00000000000009cb <+27>: mov -0x20(%rbp),%rdx
0x00000000000009cf <+31>: mov %esi,(%rdx)
0x00000000000009d1 <+33>: mov -0x28(%rbp),%rdx
0x00000000000009d5 <+37>: mov %ecx,(%rdx)
0x00000000000009d7 <+39>: mov %eax,-0xc(%rbp)
0x00000000000009da <+42>: nop
0x00000000000009db <+43>: pop %rbx
0x00000000000009dc <+44>: pop %rbp
0x00000000000009dd <+45>: retq
End of assembler dump.
Just to be clear... I know that there are better ways of doing it. But the purpose here is purely educational. Just want to understand how it works ;-)
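For comparison, here is one of those better ways: GCC ships a <cpuid.h> header whose __get_cpuid helper wraps the instruction. This is a minimal sketch assuming GCC or Clang on x86; cpu_features is a hypothetical name.

```cpp
#include <cpuid.h>   // GCC/Clang helper header, x86 only
#include <cstdint>

// Hypothetical helper: fills ecx/edx with the leaf-1 feature bits.
// Returns false if the CPU does not support leaf 1.
static inline bool cpu_features(uint32_t &ecx, uint32_t &edx) {
    unsigned int a = 0, b = 0, c = 0, d = 0;
    if (!__get_cpuid(1, &a, &b, &c, &d))
        return false;
    ecx = c;
    edx = d;
    return true;
}
```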
-- edited (removed personal opinion) ---
I have a program that is known to run on an older version of Codeblocks (ver 13.12) but does not seem to work on the newer version (ver 16.01). The program reads two integers and then adds, subtracts, multiplies, and divides them. It uses asm code, which I am new to. Why does Windows say the program has stopped responding after I type the two integers and press Enter?
Here is the code:
//Program 16
#include <stdio.h>
#include <iostream>
using namespace std;
int main() {
int arg1, arg2, add, sub, mul, quo, rem ;
cout << "Enter two integer numbers : " ;
cin >> arg1 >> arg2 ;
cout << endl;
asm ( "addl %%ebx, %%eax;" : "=a" (add) : "a" (arg1) , "b" (arg2) );
asm ( "subl %%ebx, %%eax;" : "=a" (sub) : "a" (arg1) , "b" (arg2) );
asm ( "imull %%ebx, %%eax;" : "=a" (mul) : "a" (arg1) , "b" (arg2) );
asm ( "movl $0x0, %%edx;"
"movl %2, %%eax;"
"movl %3, %%ebx;"
"idivl %%ebx;" : "=a" (quo), "=d" (rem) : "g" (arg1), "g" (arg2) );
cout<< arg1 << "+" << arg2 << " = " << add << endl;
cout<< arg1 << "-" << arg2 << " = " << sub << endl;
cout<< arg1 << "x" << arg2 << " = " << mul << endl;
cout<< arg1 << "/" << arg2 << " = " << quo << " ";
cout<< "remainder " << rem << endl;
return 0;
}
As Michael has said, your problem probably comes from your 4th asm statement being written incorrectly.
The first thing you need to understand when writing inline asm is what registers are and how they are used. Registers are a fundamental concept in x86 assembler programming, so if you don't know what they are, it's time for you to find an x86 assembly language primer.
Once you've got that, you need to understand that when the compiler runs, it is using those registers in the code it generates. For example, if you write for (int x=0; x<10; x++), x is (probably) going to end up in a register. So what happens if gcc decides to use ebx to hold the value of 'x', and then your asm statement stomps on ebx, putting some other value in it? gcc doesn't 'parse' your asm to figure out what you are doing. The only clues it has about what your asm does are the constraints listed after the asm instructions.
That's what Michael means when he says "the 4th ASM block doesn't list "EBX" in the clobber list (but its contents are destroyed)". If we look at your asm:
asm ("movl $0x0, %%edx;"
"movl %2, %%eax;"
"movl %3, %%ebx;"
"idivl %%ebx;"
: "=a" (quo), "=d" (rem)
: "g" (arg1), "g" (arg2));
You see that the 3rd line is moving a value into ebx, but there's nothing in the constraints that follow to say that it is going to be changed. The fact that your program is crashing is probably due to gcc using that register for something else. The simplest fix might be to "list EBX in the clobber list":
asm ("movl $0x0, %%edx;"
"movl %2, %%eax;"
"movl %3, %%ebx;"
"idivl %%ebx;"
: "=a" (quo), "=d" (rem)
: "g" (arg1), "g" (arg2)
: "ebx");
This tells gcc that ebx may be changed by the asm (aka it 'clobbers' it), and that it doesn't need to have any particular value when the asm statement begins, and won't have any particular value in it when the asm exits.
However, while that may be 'simplest,' it isn't necessarily the best. For example instead of using the "g" constraint for arg2, we can use the "b" constraint:
asm ("movl $0x0, %%edx;"
"movl %2, %%eax;"
"idivl %%ebx;"
: "=a" (quo), "=d" (rem)
: "g" (arg1), "b" (arg2));
This lets us get rid of the movl %3, %%ebx statement, since gcc will ensure the value is in ebx before calling the asm, and we don't need to clobber it anymore.
But why use ebx? idiv doesn't require any particular register there, and maybe gcc is already using ebx for something else. How about letting gcc just pick some register it isn't using? We do this using the "r" constraint:
asm ("movl $0x0, %%edx;"
"movl %2, %%eax;"
"idivl %3;"
: "=a" (quo), "=d" (rem)
: "g" (arg1), "r" (arg2));
Notice that the idiv now uses %3, which means "use the thing that is in the (zero-based) parameter #3." In this case, that's the register that contains arg2.
However, we can still do better. As you have already seen in your previous asm statements, you can use the "a" constraint to tell gcc to put a particular variable into the eax register. Which means we can do this:
asm ("movl $0x0, %%edx;"
"idivl %3;"
: "=a" (quo), "=d" (rem)
: "a" (arg1), "r" (arg2));
Again, 1 fewer instruction since we don't need to move the value into eax anymore. So how about that movl $0x0, %%edx thing? Well, we can get rid of that too:
asm ("idivl %3"
: "=a" (quo), "=d" (rem)
: "a" (arg1), "r" (arg2), "d" (0));
This uses the "d" constraint to put 0 into edx before executing the asm. That brings us to my final version:
asm ("idivl %3"
: "=a" (quo), "=d" (rem)
: "a" (arg1), "r" (arg2), "d" (0)
: "cc");
This says:
On input, put arg1 into eax, arg2 into some register (that we'll refer to using %3), and 0 into edx.
On output, eax will contain the quotient, edx will contain the remainder. This is how the idiv instruction works.
The "cc" clobber tells gcc that your asm modifies the flags register (eflags), which idiv does as a side effect.
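One caveat worth spelling out: putting 0 into edx gives the right answer only for non-negative dividends, because idiv divides the 64-bit value edx:eax. For signed inputs the dividend has to be sign-extended first. The sketch below is one way to do that with cltd; the function name divmod and the named operand [div] are my own choices, not part of the original code.

```cpp
// Sketch: signed 32-bit division with a sign-extended dividend.
static inline void divmod(int dividend, int divisor, int &quo, int &rem) {
    asm("cltd\n\t"             // sign-extend eax into edx:eax
        "idivl %[div]"         // eax = quotient, edx = remainder
        : "=a"(quo),
          "=&d"(rem)           // early clobber: cltd writes edx before the divisor is read
        : "a"(dividend), [div] "r"(divisor)
        : "cc");
}
```

With this version divmod(-7, 2, q, r) leaves q == -3 and r == -1, matching what C's / and % produce.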
Now, despite having described all this, I usually think using inline asm is a bad idea. It's cool, it's powerful, it gives interesting insight into how the gcc compiler works. But look at all the weird things you "just have to know" in order to work with this. And as you have noticed, if you get any of them wrong, weird things can happen.
It's true all these things are documented in gcc's docs. The simple constraints (like "r" and "g") are documented here. The specific register constraints for the x86 are in the 'x86 family' here. And the detailed description of all the asm features is here. So if you must use this stuff (for example if you are supporting some existing code that uses this), the information is out there.
But there's a much shorter read here that gives you a whole list of reasons not to use inline asm. That's the read I'd recommend. Stick with C, and let the compiler handle all that register junk for you.
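To make that concrete, the whole division block can be written without any asm at all. A minimal sketch (DivResult and divmod_plain are names I made up):

```cpp
// Sketch: quotient and remainder in plain C++; the compiler picks the registers.
struct DivResult { int quo, rem; };

static inline DivResult divmod_plain(int a, int b) {
    return {a / b, a % b};
}
```

On x86, compilers typically emit a single idiv for the pair, so nothing is lost by dropping the asm.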
PS While I'm at this:
asm ( "addl %2, %0" : "=r" (add) : "0" (arg1) , "r" (arg2) : "cc");
asm ( "subl %2, %0" : "=r" (sub) : "0" (arg1) , "r" (arg2) : "cc");
asm ( "imull %2, %0" : "=r" (mul) : "0" (arg1) , "r" (arg2) : "cc");
Check out the gcc docs to see what it means to use a digit in an input operand.
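In short, a digit used as an input constraint is a matching constraint: it ties that input to the same location as the numbered output. Wrapped in a function, the first of those statements looks like this (add_asm is a hypothetical name; x86 and GCC-style asm assumed):

```cpp
// Sketch: "0" places a in whatever register gcc chose for output operand 0.
static inline int add_asm(int a, int b) {
    int result;
    asm("addl %2, %0"          // result (initially a) += b
        : "=r"(result)
        : "0"(a), "r"(b)
        : "cc");
    return result;
}
```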
David Wohlferd has given a very good answer on how to better work with GCC extended assembly templates to do the work of your existing code.
A question may arise as to why the code presented fails with Codeblocks 16.01 w/GCC whereas it may have worked previously. As it stands the code looks pretty simple, so what could have possibly gone wrong?
My main recommendation is to learn to use the debugger and set breakpoints in Codeblocks. It is very simple (but beyond the scope of this answer). You can learn more about debugging in the Codeblocks documentation.
If you used the debugger with Codeblocks 16.01, with a stock C++ console project you may have discovered that the program is giving you an Arithmetic Exception on the IDIV instruction in the assembly template. This is what appears in my console output:
Program received signal SIGFPE, Arithmetic exception.
These lines of code do as you would expect:
asm ( "addl %%ebx, %%eax;" : "=a" (add) : "a" (arg1) , "b" (arg2) );
asm ( "subl %%ebx, %%eax;" : "=a" (sub) : "a" (arg1) , "b" (arg2) );
asm ( "imull %%ebx, %%eax;" : "=a" (mul) : "a" (arg1) , "b" (arg2) );
This is where we have issues:
asm ( "movl $0x0, %%edx;"
"movl %2, %%eax;"
"movl %3, %%ebx;"
"idivl %%ebx;" : "=a" (quo), "=d" (rem) : "g" (arg1), "g" (arg2) );
One thing Codeblocks can do for you is show you the assembly code it generated. Pull down the Debug menu, select Debugging Windows > and Disassembly. I highly recommend the Watches and CPU Registers windows as well.
If you review the generated code with CodeBlocks 16.01 w/GCC you might discover it produced this:
/* Automatically produced by the assembly template for input constraints */
mov -0x20(%ebp),%eax /* EAX = value of arg1 */
mov -0x24(%ebp),%edx /* EDX = value of arg2 */
/* Our assembly template instructions */
mov $0x0,%edx /* EDX = 0 - we just clobbered the previous EDX! */
mov %eax,%eax /* EAX remains the same */
mov %edx,%ebx /* EBX = EDX = 0. */
idiv %ebx /* EBX is 0 so this is division by zero!! */
/* Automatically produced by the assembly template for output constraints */
mov %eax,-0x18(%ebp) /* Value at quo = EAX */
mov %edx,-0x1c(%ebp) /* Value at rem = EDX */
I have commented the code and it should be obvious why this code won't work. We effectively ended up placing zero in EBX and then attempted to use that as a divisor with IDIV and that produced an arithmetic exception (division by zero in this case).
This happened because GCC will (by default) assume that all the input operands are used (consumed) BEFORE the output operands are written to. We never told GCC that it couldn't potentially use the same input operands as output operands. GCC considers this situation an Early Clobber. It provides a mechanism to mark an output constraint as early clobber using & (ampersand) modifier:
`&'
Means (in a particular alternative) that this operand is an earlyclobber operand, which is modified before the instruction is finished using the input operands. Therefore, this operand may not lie in a register that is used as an input operand or as part of any memory address.
To deal with the early clobbers, we can place & on both output constraints like this:
"idivl %%ebx;" : "=&a" (quo), "=&d" (rem) : "g" (arg1), "g" (arg2) );
In this case arg1 and arg2 will not be passed in through any of the operands marked with &. This means this code will avoid using EAX and EDX for the input operands arg1 and arg2.
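A smaller, self-contained illustration of the same rule is sketched below. The function is hypothetical; the point is only that the output is written by the first instruction while an input is still needed by the second, so the output must be marked with &.

```cpp
// Sketch: out is written before b is read, so it needs the early-clobber mark.
// Without '&', gcc would be free to give out and b the same register.
static inline int add_then_double(int a, int b) {
    int out;
    asm("movl %1, %0\n\t"      // out = a   (output written here)
        "addl %2, %0\n\t"      // out += b  (input b still needed here)
        "addl %0, %0"          // out *= 2
        : "=&r"(out)
        : "r"(a), "r"(b)
        : "cc");
    return out;
}
```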
The other issue is that EBX is modified by your code but you don't tell GCC. You could simply add EBX to the clobber list in the assembly template like this:
"idivl %%ebx;" : "=&a" (quo), "=&d" (rem) : "g" (arg1), "g" (arg2) : "ebx");
So this code should work, but is not efficient:
asm ( "movl $0x0, %%edx;"
"movl %2, %%eax;"
"movl %3, %%ebx;"
"idivl %%ebx;" : "=&a" (quo), "=&d" (rem) : "g" (arg1), "g" (arg2) : "ebx");
The generated code will now look something like:
/* Automatically produced by the assembler template for input constraints */
mov -0x30(%ebp),%ecx /* ECX = value of arg1 */
mov -0x34(%ebp),%esi /* ESI = value of arg2 */
/* Our assembly template instructions */
mov $0x0,%edx /* EDX = 0 */
mov %ecx,%eax /* EAX = ECX = arg1 */
mov %esi,%ebx /* EBX = ESI = arg2 */
idiv %ebx
/* Automatically produced by the assembler template for output constraints */
mov %eax,-0x28(%ebp) /* Value at quo = EAX */
mov %edx,-0x2c(%ebp) /* Value at rem = EDX */
This time the input operands for arg1 and arg2 didn't share the same registers that would conflict with the MOV instructions inside our inline assembly template.
Why do other (including older) versions of GCC work?
If GCC had generated instructions using registers other than EAX, EDX, and EBX for the arg1 and arg2 operands, then it would have worked. But the fact that it may have worked was just luck. To see what happened with the older Codeblocks and the GCC that came with it, I'd recommend reviewing the code generated in that environment the same way I have discussed above.
Early clobbering, and register clobbering in general is a reason that extended assembler templates can be tricky, and a reason extended assembler templates should be used sparingly especially if you don't have a solid understanding.
You can create code that appears to work, but is coded incorrectly. A different version of GCC or even different optimization levels may alter the behaviour of the code. Sometimes these bugs can be so subtle that as a program grows the bug manifests itself in other ways that may be hard to trace.
Another rule of thumb is that not all code you find on the internet is bug free, and the subtle complexities of extended inline assembly are often overlooked in tutorials. I discovered that the code you used seems to be based on this Code Project. Unfortunately the author didn't have a thorough understanding of the intricacies involved. The code may have worked at the time, but not necessarily now.
I've been trying to access the PEB information of an executable as seen here: Access x64 TEB C++ & Assembly
The code works only in AT&T syntax for some odd reason but when I try to use Intel syntax, it fails to give the same value. There's of course an error on my part. So I'm asking..
How can I convert:
int main()
{
void* ptr = 0; //0x7fff5c4ff3c0
asm volatile
(
"movq %%gs:0x30, %%rax\n\t"
"movq 0x60(%%rax), %%rax\n\t"
"movq 0x18(%%rax), %%rax\n\t"
"movq %%rax, %0\n"
: "=r" (ptr) ::
);
}
to Intel Syntax?
I tried:
asm volatile
(
"movq rax, gs:[0x30]\n\t"
"movq rax, [rax + 0x60]\n\t"
"movq rax, [rax + 0x18]\n\t"
"movq rax, %0\n"
: "=r" (ptr) ::
);
and:
asm volatile
(
"mov rax, QWORD PTR gs:[0x30]\n\t"
"mov rax, QWORD PTR [rax + 0x60]\n\t"
"mov rax, QWORD PTR [rax + 0x18]\n\t"
"movq rax, %0\n" //mov rax, QWORD PTR [%0]\n
: "=r" (ptr) ::
);
They do not print the same value as the AT&T syntax: 0x7fff5c4ff3c0
Any ideas?
You forgot to reverse the operand order on the last line: in Intel syntax the destination comes first, so it should be mov %0, rax. That said, the only instruction you need to have in asm is the first one due to the gs segment override; the rest could be done in C.
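To show the operand-order difference side by side without touching gs, here is a sketch that uses GCC's dialect alternatives ({att|intel}) so that one template carries both spellings. copy_via_asm is a made-up name, and the example assumes an x86-64 target.

```cpp
#include <cstdint>

// Sketch: the same move written for both dialects.
//   AT&T  (default):      movq %1, %0   (source first)
//   Intel (-masm=intel):  mov  %0, %1   (destination first)
static inline uint64_t copy_via_asm(uint64_t src) {
    uint64_t dst;
    asm("mov{q %1, %0| %0, %1}" : "=r"(dst) : "r"(src));
    return dst;
}
```

The compiler emits whichever half of the braces matches the dialect it is generating.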
I'm writing a simple but a little specific program:
Purpose: calculate number from its factorial
Requirements: all calculations must be done on gcc inline asm (at&t syntax)
Source code:
#include <iostream>
int main()
{
unsigned n = 0, f = 0;
std::cin >> n;
asm
(
"mov %0, %%eax \n"
"mov %%eax, %%ecx \n"
"mov 1, %%ebx \n"
"mov 1, %%eax \n"
"jmp cycle_start\n"
"cycle:\n"
"inc %%ebx\n"
"mul %%ebx\n"
"cycle_start:\n"
"cmp %%ecx, %%eax\n"
"jnz cycle\n"
"mov %%ebx, %1 \n":
"=r" (n):
"r" (f)
);
std::cout << f;
return 0;
}
This code causes SIGSEGV.
An identical program in Intel asm syntax (http://pastebin.com/2EqJmGAV) works fine. Why does my "AT&T program" fail and how can I fix it?
#include <iostream>
int main()
{
unsigned n = 0, f = 0;
std::cin >> n;
__asm
{
mov eax, n
mov ecx, eax
mov eax, 1
mov ebx, 1
jmp cycle_start
cycle:
inc ebx
mul ebx
cycle_start:
cmp eax, ecx
jnz cycle
mov f, ebx
};
std::cout << f;
return 0;
}
UPD: Pushing the used registers to the stack and restoring them afterwards gives the same result: SIGSEGV
You have your input and output the wrong way around.
So, start by altering
"=r" (n):
"r" (f)
to:
"=r" (f) :
"r" (n)
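Operands are numbered with the outputs first, so after this swap %0 refers to f and %1 to n; exchange the two operand numbers in the template to match.
Then you'll need to tell the compiler about clobbers (registers you are using that aren't inputs or outputs). Note that mul also writes edx, and the arithmetic instructions change the flags.
So add:
: "eax", "ebx", "ecx", "edx", "cc"
after the two lines above.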
I personally would make some other changes:
Use local labels (1: and 2: etc), which allows the code to be duplicated without "duplicate label".
Use the output operand (f) instead of %%ebx - that way, you are not using an extra register.
Move the input operand (n) directly to %%ecx. You are loading 1 into %%eax two instructions later, so what purpose does it serve in %%eax?
[Now, I've written too much, and someone else has answered first... ]
Edit: And, as Anton points out, you need $1 to load the constant 1; a bare 1 means read from address 1, which doesn't work well and is most likely the cause of your problems.
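With all of the above applied, the asm might end up looking like the sketch below. It is one possible arrangement rather than the only one: the function name is mine, the output is marked early-clobber because it is written before n is read, and like the original it only terminates when n is an exact factorial.

```cpp
// Sketch: find k such that k! == n, using local labels and $1 immediates.
static inline unsigned inverse_factorial(unsigned n) {
    unsigned f;
    asm("movl $1, %%eax\n\t"   // running product = 1
        "movl $1, %0\n\t"      // candidate k = 1
        "jmp 2f\n"
        "1:\n\t"
        "incl %0\n\t"          // k++
        "mull %0\n"            // edx:eax = eax * k
        "2:\n\t"
        "cmpl %1, %%eax\n\t"   // product == n ?
        "jne 1b"
        : "=&r"(f)
        : "r"(n)
        : "eax", "edx", "cc");
    return f;
}
```

For example, inverse_factorial(120) walks the product through 1, 2, 6, 24, 120 and returns 5.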
Hopefully there is no requirement to figure it out using nothing but gcc inline asm. You can assemble your Intel-syntax version with nasm, then disassemble it with objdump to see the correct AT&T syntax.
I seem to recall that mov 1,%eax should be mov $1,%eax if you mean literal constant and not a memory reference.
The answer by @MatsPetersson is very useful regarding the interaction of your inline assembly with the compiler (clobbered/input/output registers). I've focused on the reason why you get SIGSEGV, and reading the address 1 does answer the question.