Why isn't my inline assembly in C++ working?

Strangest error output:
#include <iostream>
int main(int arg, char **LOC[])
{
asm
(
"mov eax, 0CF;"
"pusha;"
);
return 0;
}
It complains, and here is the error from GCC:
t.s: Assembler messages:
t.s:31: Error: too many memory references for `mov'

You get this error because your assembly is malformed. Registers are written with a % prefix, like %eax, and $ marks an immediate operand. Hexadecimal constants need a 0x prefix, so 0CF should be 0xcf. Furthermore, GCC by default (see DanielKO's comment) uses AT&T syntax, which puts the source on the left and the destination on the right. Is this what you are looking for?
mov $0xcf, %eax
Also, your pusha is unbalanced, i.e. you don't clean up the stack correctly before you return from your function. It would be nice to know what your overall goal is, because right now it seems like you copied and pasted only an incomplete fragment of the source.
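To make the corrected instruction concrete, here is a minimal sketch that loads 0xcf through an output operand instead of hard-coding a register and pushing everything. It assumes GCC or clang targeting x86 in the default AT&T dialect; the function name load_cf is hypothetical and not part of the original question.

```cpp
#include <cstdint>

// Hypothetical helper: returns 0xCF, loaded by an inline-asm mov on x86.
std::uint32_t load_cf()
{
    std::uint32_t value;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // AT&T order: source first, destination last. "$" marks the immediate,
    // and %0 is replaced by whichever register the compiler picks for value.
    __asm__ ("movl $0xcf, %0" : "=r"(value));
#else
    value = 0xcf;  // plain C++ fallback so the sketch builds on other targets
#endif
    return value;
}
```

Because the compiler chooses the register and knows it is written, there is nothing to save or restore, so no pusha is needed.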

Related

Referencing memory operands in .intel_syntax GNU C inline assembly

I'm catching a link error when compiling and linking a source file with inline assembly.
Here are the test files:
via:$ cat test.cxx
extern int libtest();
int main(int argc, char* argv[])
{
return libtest();
}
$ cat lib.cxx
#include <stdint.h>
int libtest()
{
uint32_t rnds_00_15;
__asm__ __volatile__
(
".intel_syntax noprefix ;\n\t"
"mov DWORD PTR [rnds_00_15], 1 ;\n\t"
"cmp DWORD PTR [rnds_00_15], 1 ;\n\t"
"je done ;\n\t"
"done: ;\n\t"
".att_syntax noprefix ;\n\t"
:
: [rnds_00_15] "m" (rnds_00_15)
: "memory", "cc"
);
return 0;
}
Compiling and linking the program results in:
via:$ g++ -fPIC test.cxx lib.cxx -c
via:$ g++ -fPIC lib.o test.o -o test.exe
lib.o: In function `libtest()':
lib.cxx:(.text+0x1d): undefined reference to `rnds_00_15'
lib.cxx:(.text+0x27): undefined reference to `rnds_00_15'
collect2: error: ld returned 1 exit status
The real program is more complex. The routine is out of registers so the flag rnds_00_15 must be a memory operand. Use of rnds_00_15 is local to the asm block. It is declared in the C code to ensure the memory is allocated on the stack and nothing more. We don't read from it or write to it as far as the C code is concerned. We list it as a memory input so GCC knows we use it and wire up the "C variable name" in the extended ASM.
Why am I receiving a link error, and how do I fix it?
Compile with gcc -masm=intel and don't try to switch modes inside the asm template string. AFAIK there's no equivalent before clang14. (Note: macOS installs clang as gcc / g++ by default.)
Also, of course you need to use valid GNU C inline asm, using operands to tell the compiler which C objects you want to read and write.
Can I use Intel syntax of x86 assembly with GCC? clang14 supports -masm=intel like GCC
How to set gcc to use intel syntax permanently? clang13 and earlier didn't.
I don't believe Intel syntax uses the percent sign. Perhaps I am missing something?
You're getting mixed up between %operand substitutions into the Extended-Asm template (which use a single %), vs. the final asm that the assembler sees.
You need %% to use a literal % in the final asm. You wouldn't use "mov %%eax, 1" in Intel-syntax inline asm, but you do still use "mov %0, 1" or %[named_operand].
See https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html. In Basic asm (no operands), there is no substitution and % isn't special in the template, so you'd write mov $1, %eax in Basic asm vs. mov $1, %%eax in Extended, if for some reason you weren't using an operand like mov $1, %[tmp] or mov $1, %0.
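The difference between % and %% can be sketched as follows, assuming GCC or clang on x86 with the default AT&T dialect. The function add_one is a hypothetical example: it names eax literally (so it needs %% and a clobber) while still letting the compiler pick the registers for %0 and %1.

```cpp
// Hypothetical example: adds 1 to x, going through a hard-coded eax.
unsigned add_one(unsigned x)
{
    unsigned result;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ ("movl %1, %%eax\n\t"   // %1 is substituted; %%eax becomes a literal %eax
             "addl $1, %%eax\n\t"
             "movl %%eax, %0"       // %0 is substituted with the output register
             : "=r"(result)
             : "r"(x)
             : "eax", "cc");        // tell the compiler eax and the flags are modified
#else
    result = x + 1;  // plain C++ fallback for non-x86 targets
#endif
    return result;
}
```

In practice you would normally drop the hard-coded register and operate on %0 directly; the literal eax is only there to show where %% is required.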
uint32_t rnds_00_15; is a local with automatic storage. Of course there's no asm symbol with that name.
Use %[rnds_00_15] and compile with -masm=intel (And remove the .att_syntax at the end; that would break the compiler-generated asm that comes after.)
You also need to remove the DWORD PTR, because the operand-expansion already includes that, e.g. DWORD PTR [rsp - 4], and clang errors on DWORD PTR DWORD PTR [rsp - 4]. (GAS accepts it just fine, but the 2nd one takes precedence so it's pointless and potentially misleading.)
And you'll want a "=m" output operand if you want the compiler to reserve you some scratch space on the stack. You must not modify input-only operands, even if it's unused in the C. Maybe the compiler decides it can overlap something else because it's not written and not initialized (i.e. UB). (I'm not sure if your "memory" clobber makes it safe, but there's no reason not to use an early-clobber output operand here.)
And you'll want to avoid label name conflicts by using %= to get a unique number.
Working example (GCC and ICC, but not clang unfortunately), on the Godbolt compiler explorer (which uses -masm=intel depending on options in the dropdown). You can use "binary mode" (the 11010 button) to prove that it actually assembles after compiling to asm without warnings.
int libtest_intel()
{
uint32_t rnds_00_15;
// Intel syntax operand-size can only be overridden with operand modifiers
// because the expansion includes an explicit DWORD PTR
__asm__ __volatile__
( // ".intel_syntax noprefix \n\t"
"mov %[rnds_00_15], 1 \n\t"
"cmp %[rnds_00_15], 1 \n\t"
"je .Ldone%= \n\t"
".Ldone%=: \n\t"
: [rnds_00_15] "=&m" (rnds_00_15)
:
: // no clobbers
);
return 0;
}
Compiles (with gcc -O3 -masm=intel) to this asm. Also works with gcc -m32 -masm=intel of course:
libtest_intel:
mov DWORD PTR [rsp-4], 1
cmp DWORD PTR [rsp-4], 1
je .Ldone8
.Ldone8:
xor eax, eax
ret
I couldn't get this to work with clang: It choked on .intel_syntax noprefix when I left that in explicitly.
Operand-size overrides:
You have to use %b[tmp] to get the compiler to substitute in BYTE PTR [rsp-4] to only access the low byte of a dword input operand. I'd recommend AT&T syntax if you want to do much of this.
Using %[rnds_00_15] results in Error: junk '(%ebp)' after expression.
That's because you switched to Intel syntax without telling the compiler. If you want it to use Intel addressing modes, compile with -masm=intel so the compiler can substitute into the template with the correct syntax.
This is why I avoid that crappy GCC inline assembly at nearly all costs. Man I despise this crappy tool.
You're just using it wrong. It's a bit cumbersome, but makes sense and mostly works well if you understand how it's designed.
Repeat after me: The compiler doesn't parse the asm string at all, except to do text substitutions of %operand. This is why it doesn't notice your .intel_syntax noprefix and keeps substituting AT&T syntax.
It does work better and more easily with AT&T syntax though, e.g. for overriding the operand-size of a memory operand, or adding an offset. (e.g. 4 + %[mem] works in AT&T syntax).
Dialect alternatives:
If you want to write inline asm that doesn't depend on -masm=intel or not, use Dialect alternatives (which makes your code super-ugly; not recommended for anything other than wrapping one or two instructions):
Also demonstrates operand-size overrides
#include <stdint.h>
int libtest_override_operand_size()
{
uint32_t rnds_00_15;
// Intel syntax operand-size can only be overridden with operand modifiers
// because the expansion includes an explicit DWORD PTR
__asm__ __volatile__
(
"{movl $1, %[rnds_00_15] | mov %[rnds_00_15], 1} \n\t"
"{cmpl $1, %[rnds_00_15] | cmp %k[rnds_00_15], 1} \n\t"
"{cmpw $1, %[rnds_00_15] | cmp %w[rnds_00_15], 1} \n\t"
"{cmpb $1, %[rnds_00_15] | cmp %b[rnds_00_15], 1} \n\t"
"je .Ldone%= \n\t"
".Ldone%=: \n\t"
: [rnds_00_15] "=&m" (rnds_00_15)
);
return 0;
}
With Intel syntax, gcc compiles it to:
mov DWORD PTR [rsp-4], 1
cmp DWORD PTR [rsp-4], 1
cmp WORD PTR [rsp-4], 1
cmp BYTE PTR [rsp-4], 1
je .Ldone38
.Ldone38:
xor eax, eax
ret
With AT&T syntax, compiles to:
movl $1, -4(%rsp)
cmpl $1, -4(%rsp)
cmpw $1, -4(%rsp)
cmpb $1, -4(%rsp)
je .Ldone38
.Ldone38:
xorl %eax, %eax
ret

Is it possible to use explicit register variables in GCC with C++17?

I am using explicit register variables to pass parameters to a raw Linux syscall using registers that don't have machine-specific constraints (such as r8, r9, r10 on x86_64) as suggested here.
#include <asm/unistd.h>
#ifdef __i386__
#define _syscallOper "int $0x80"
#define _syscallNumReg "eax"
#define _syscallRetReg "eax"
#define _syscallReg1 "ebx"
#define _syscallReg2 "ecx"
#define _syscallReg3 "edx"
#define _syscallReg4 "esi"
#define _syscallReg5 "edi"
#define _syscallReg6 "ebp"
#define _syscallClob
#else
#define _syscallOper "syscall"
#define _syscallNumReg "rax"
#define _syscallRetReg "rax"
#define _syscallReg1 "rdi"
#define _syscallReg2 "rsi"
#define _syscallReg3 "rdx"
#define _syscallReg4 "r10"
#define _syscallReg5 "r8"
#define _syscallReg6 "r9"
#define _syscallClob "rcx", "r11"
#endif
template <typename Ret = long, typename T1>
Ret syscall(long num, T1 arg1)
{
register long _num __asm__(_syscallNumReg) = num;
register T1 _arg1 __asm__(_syscallReg1) = arg1;
register Ret _ret __asm__(_syscallRetReg);
__asm__ __volatile__(_syscallOper
: "=r"(_ret)
: "r"(_num), "r"(_arg1)
: _syscallClob);
return _ret;
}
extern "C" void _start()
{
syscall(__NR_exit, 0);
}
However this feature requires the use of register keyword which was deprecated in C++11 and removed in C++17. So when I compile this code with GCC 7 (-std=c++17 -nostdlib) it gives me a warning:
ISO C++1z does not allow ‘register’ storage class specifier [-Wregister]
and it seems to ignore the register allocation and the program segfaults because syscall wasn't called properly. This code however compiles and works fine in Clang 6. Note: I actually have 6 syscall functions (up to 6 arguments) but only 1-argument version is shown here for the sake of minimal example.
I realize that register keyword by itself wasn't really useful that's why it was removed, but this specific use case seems like an exception to me so it seems unreasonable to remove compiler support for it as well.
I also realize that this use case is compiler-specific (i.e. non-standard), so my question is about compiler support rather than removal from the standard.
It appears you've found a GCC bug: GNU register-asm local variables don't work inside template functions. (clang compiles your example correctly). Apparently this was already a known bug, thanks to @Florian for finding it.
-Wregister triggering is just a symptom of this first bug: GNU register-asm local variables don't trigger the warning. But in a template, gcc compiles them as if they were plain register int foo = bar; without the asm part of the declaration. So GCC thought you were just using plain register variables, not register-asm.
In a regular function, your code compiles fine with no warnings, even with -std=c++17.
#define T1 unsigned long
#define Ret T1
// template <typename Ret = long, typename T1>
... your code unchanged ...
__asm__ __volatile__(_syscallOper " #operands in %0, %1, %2"
...
On Godbolt with gcc7.3 -O3:
_start:
movl $60, %eax
xorl %edx, %edx
syscall #operands in %rax, %rax, %edx
ret
But clang6.0 doesn't have this bug, and we get:
_start: # @_start
movl $60, %eax
xorl %edi, %edi
syscall #operands in %rax, %rax, %edi
retq
Notice the asm comment I appended to your template (with C++ string-literal concatenation). We can just get the compiler to tell us what it thinks it's doing, instead of having to puzzle things out.
(Mostly posting this answer to discuss that debugging technique; Florian's answer already covers the specifics of this actual case.)
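The debugging technique described above can be reduced to a very small sketch: an asm statement whose template is only an assembler comment, so it emits no instruction but still shows which register the compiler substituted. This assumes GCC or clang on x86, where # starts a comment in GAS syntax; the function name tag_operand is hypothetical.

```cpp
// Hypothetical helper: returns v unchanged, but leaves a comment in the
// compiler's asm output naming the register that holds v.
long tag_operand(long v)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // "+r" forces v into a register and substitutes its name for %0.
    // The whole template is a comment, so no machine code is generated for it.
    __asm__ __volatile__("# v lives in %0" : "+r"(v));
#endif
    return v;
}
```

Compiling with g++ -O2 -S and searching the output for "v lives in" shows the substituted register name next to the surrounding compiler-generated code.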
Instead of templates, you can use MUSL's existing portable headers:
It's a C library, so it might need a bit of extra casting to keep a C++ compiler happy, or changes to avoid using temporary expressions as lvalues in the ARM headers.
But it should take care of most of the issues Florian pointed out. It has a permissive license, so you can just copy its syscall wrapper headers into your project. They work without linking against the rest of MUSL, and are truly inline.
http://git.musl-libc.org/cgit/musl/tree/arch/x86_64/syscall_arch.h is the x86-64 version.
This looks like a GCC bug to me. The C++17 warning is a red herring. The code works fine with optimization for me (when compiled with GCC 7), but it breaks at -O0.
According to the documentation for local register variables, this is not expected, so this is likely a GCC bug. According to this bug report, it is not even related to optimization, but ultimately caused by the use of a template.
I suggest overloading only on the number of system call arguments in the ultimate system call wrapper, and using long types for all arguments and the result:
inline long syscall_base(long num, long arg1)
{
register long _num __asm__(_syscallNumReg) = num;
register long _arg1 __asm__(_syscallReg1) = arg1;
register long _ret __asm__(_syscallRetReg);
__asm__ __volatile__(_syscallOper
: "=r"(_ret)
: "r"(_num), "r"(_arg1)
: _syscallClob);
return _ret;
}
template <typename Ret = long, typename T1>
Ret syscall(long num, T1 arg1)
{
return (Ret) (syscall_base(num, (long) arg1));
}
You'll have to use something nicer for the casts (probably type-indexed conversion functions), and of course you still have to deal with the syscall ABI variance in some other way (x32 has long long instead of long, and POWER has two return registers instead of one, etc.), but that's also a problem with your original approach.
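One way the "nicer casts" could look is sketched below. The helper names to_syscall_arg, from_syscall_ret and syscall1 are hypothetical, and the sketch assumes an LP64 target where long is as wide as a pointer. The register-asm part is only compiled on x86-64 Linux with a GNU-compatible compiler, and the "memory" clobber is an addition that the original wrapper does not have.

```cpp
#include <type_traits>

// Hypothetical conversion helpers: pointers are reinterpreted,
// everything else (integers, enums) is converted by value.
template <typename T>
typename std::enable_if<std::is_pointer<T>::value, long>::type
to_syscall_arg(T p) { return reinterpret_cast<long>(p); }

template <typename T>
typename std::enable_if<!std::is_pointer<T>::value, long>::type
to_syscall_arg(T v) { return static_cast<long>(v); }

// Hypothetical converse for the raw return value.
template <typename Ret>
typename std::enable_if<std::is_pointer<Ret>::value, Ret>::type
from_syscall_ret(long r) { return reinterpret_cast<Ret>(r); }

template <typename Ret>
typename std::enable_if<!std::is_pointer<Ret>::value, Ret>::type
from_syscall_ret(long r) { return static_cast<Ret>(r); }

#if defined(__linux__) && defined(__x86_64__) && defined(__GNUC__)
// Non-template base, so the register-asm locals never sit inside a template.
inline long syscall_base(long num, long arg1)
{
    register long _num __asm__("rax") = num;
    register long _arg1 __asm__("rdi") = arg1;
    register long _ret __asm__("rax");
    __asm__ __volatile__("syscall"
                         : "=r"(_ret)
                         : "r"(_num), "r"(_arg1)
                         : "rcx", "r11", "memory");  // "memory" added as a precaution
    return _ret;
}

// Thin typed wrapper: all type handling happens in the conversion helpers.
template <typename Ret = long, typename T1>
Ret syscall1(long num, T1 arg1)
{
    return from_syscall_ret<Ret>(syscall_base(num, to_syscall_arg(arg1)));
}
#endif
```

Keeping the conversions in separate overloads means the template layer contains no register variables at all, which sidesteps the GCC problem discussed above.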

Inserting a comment in __asm results in C2400 error (VS2012)

I was trying to check the compiled assembler of some code in VS 2012. I added two lines (before and after my code) as such:
__asm ; it begins here!
// My code
__asm ; it ends here!
However, VS didn't like that. I got
error C2400: inline assembler syntax error in 'opcode'; found 'bad token'
So I added a NOP, which I didn't want to:
__asm NOP ; Comment!
That worked fine. My question is twofold.
Why didn't VS allow me to add an assembly comment?
Is there a different way to add an assembly comment without adding an instruction, including NOP?
The reason it doesn't work is that __asm is a keyword, just like int is a keyword, it cannot appear by itself and must follow the proper syntax. Take the following bit of code as an example:
int main()
{
int // here's a comment, but it's ignored by the compiler
return 0;
}
This code fails with a compilation error; in VS2012 specifically you get error C2143: syntax error : missing ';' before 'return'. The error is expected, since there is no terminating semicolon to end the statement. Add the semicolon and it compiles fine, because we no longer disobey the syntax of the C (or C++ in this case) language:
int main()
{
int // here's a comment, but it's ignored by the compiler
; // white space and comments are ignored by the compiler
return 0;
}
The same is true of the following code:
int main()
{
__asm ; here's a comment but it's ignored
return 0;
}
Except here we get the error error C2400: inline assembler syntax error in 'opcode'; found 'constant', because it's treating everything after the __asm keyword as an assembler instruction, and the comment is rightfully being ignored. So the following code WOULD work:
int main()
{
__asm ; here's a comment but it's ignored
NOP ; white space and comments are ignored by the compiler
__asm {; here's an __asm 'block'
} // outside of __asm block so only C style comments work
return 0;
}
So that answers your first question: Why didn't VS allow me to add an assembly comment?.. because it is a syntax error.
Now for your second question: Is there a different way to add an assembly comment without adding an instruction, including NOP?
Directly, no; indirectly, yes. The contents of an __asm statement are compiled into your program as inline assembly, so comments inside it are removed just as if they were standard C/C++ comments, and trying to 'force' a comment into the generated assembly that way is pointless. Instead, you can use the /FAs compiler flag, which generates an assembly listing (machine code) interleaved with the source. For example:
Given the following (very simple) code:
int main()
{
// here's a normal comment
__asm { ; here's an asm comment and empty block
} // here's another normal comment
return 0;
}
When compiled with the /FAs compiler flag, the file.asm that was produced had the following output in it:
; Listing generated by Microsoft (R) Optimizing Compiler Version 18.00.31101.0
TITLE C:\test\file.cpp
.686P
.XMM
include listing.inc
.model flat
INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES
PUBLIC _main
; Function compile flags: /Odtp
; File c:\test\file.cpp
_TEXT SEGMENT
_main PROC
; 2 : {
push ebp
mov ebp, esp
; 3 : // here's a normal comment
; 4 : __asm { ; here's an asm comment and empty block
; 5 : } // here's another normal comment
; 6 : return 0;
xor eax, eax
; 7 : }
pop ebp
ret 0
_main ENDP
_TEXT ENDS
END
Notice how it includes the source and comments. If this code did more, you would see more assembly and the source associated with that as well.
If you're wanting to put comments in the inline assembly itself, then you can use normal C/C++ style comments as well as assembly comments within the __asm block itself:
int main()
{
// here's a C comment
__asm { ; here's an asm comment
// some other comments
NOP ; asm type comment
NOP // C style comment
} // here's another comment
return 0;
}
Hope that can help.
EDIT:
It should be noted the following bit of code also compiles without error and I'm not 100% sure why:
int main()
{
__asm
__asm ; comment
// also just doing it on a single line works too: __asm __asm
return 0;
}
Compiling this code with the single __asm ; comment gives the compilation error, but with both it compiles fine; adding instructions to the above code and inspecting the .asm output shows that the second __asm is ignored for any other assembly commands preceding it. So I'm not 100% sure if this is a parsing bug or part of the __asm keyword syntax as there's no documentation on this behavior.
On Linux, g++ accepts this:
__asm(";myComment");
and outputs, when you run g++ -S -O3 filename.cpp:
# 5 "filename.cpp" 1
;myComment
However, clang++ does not like it, and complains with this, when you run clang++ -S -O3 filename.cpp:
filename.cpp:5:9: error: invalid instruction mnemonic 'myComment'
__asm(";myComment");
^
<inline asm>:1:3: note: instantiated into assembly here
;myComment
^~~~~~~~~
I was, however, able to get both g++ and clang++ to accept:
__asm("//myComment");
which outputs the same comment as in the assembly output above, for both compilers.
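Applied to the original goal of marking where a piece of code begins and ends in the assembly output, the // form could be used as in this sketch. It assumes GCC or clang on x86; ASM_MARK and sum_to are hypothetical names, and the optimizer is still free to move ordinary computations across the markers, since they only act as comments.

```cpp
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
// Emits only an assembler comment; no instruction is generated.
#define ASM_MARK(text) __asm__ __volatile__("// " text)
#else
#define ASM_MARK(text) ((void)0)  // no-op on other compilers or targets
#endif

// Hypothetical function whose loop we want to find in the -S output.
int sum_to(int n)
{
    ASM_MARK("sum_to: loop begins");
    int total = 0;
    for (int i = 1; i <= n; ++i)
        total += i;
    ASM_MARK("sum_to: loop ends");
    return total;
}
```

Running g++ -S or clang++ -S on this and searching for "sum_to:" in the output locates the two markers without adding a NOP.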
What clued me in to this, as I was unable to find it anywhere else on the internet, was reading from here:
Microsoft Specific
Instructions in an __asm block can use assembly-language comments:
C++
__asm mov ax, offset buff ; Load address of buff
Because C macros expand into a single logical line, avoid using
assembly-language comments in macros. (See Defining __asm Blocks as C
Macros.) An __asm block can also contain C-style comments; for more
information, see Using C or C++ in __asm Blocks.
END Microsoft Specific
This page then links to here and here. These provide more information on the matter.

gdb in Windows: different behaviour when debugging compiled C and C++ code

I've noticed a strange behaviour of GDB 7.5 on Windows. Consider the following C program:
int foo(void){
int i = 5;
return i;
}
int main(int argc, char** argv){
foo();
return 0;
}
When compiled as either Classic C or C++, the GDB disass foo command gives the same assembly code, as follows:
Dump of assembler code for function foo:
0x00401954 <+0>: push %ebp
0x00401955 <+1>: mov %esp,%ebp
0x00401957 <+3>: sub $0x10,%esp
0x0040195a <+6>: movl $0x5,-0x4(%ebp)
0x00401961 <+13>: mov -0x4(%ebp),%eax
0x00401964 <+16>: leave
0x00401965 <+17>: ret
End of assembler dump.
However, after inserting a breakpoint at the "leave" command, like so: br *0x00401964, running the code up to that line, and attempting to print out the variable i, the executables produced by compiling it as C and C++ behave differently. The C executable works as expected and prints out $i = 5, while with the C++ executable GDB chokes up and says "no symbol i in current context".
So just out of curiosity I'd like to know if this is a GDB bug or feature? Or is the compiler (GCC) doing something subtly different so that there's something happening between the lines? Thanks.
EDIT:
Well, I don't think it's true the compiler removed the function completely, because breaking at the line before "leave" and printing the value of i does work.
This is neither a bug nor a feature, nor a side effect of compiler optimization. The disassembly is clearly the output of a non-optimized build (i is written to the stack at foo+6 and reread from the stack one step later at foo+13).
While the assembly output of C and C++ is the same in this case, the debug symbol output is slightly different. The scope of i is more limited in C++. I can only speculate about the reasons. I would guess that this is related to the fact that scoping is more complex in C++ (think of constructors, destructors, exceptions) and so the C++ part of gcc is stricter on scopes than the C part of gcc.
Details
(I checked everything on a 32-bit build but on a 64-bit Linux with gcc 4.8 and gdb 7.6. While some details will differ on Windows I expect the general mechanics to be the same)
Note that addresses differ in my case.
(gdb) disas foo
Dump of assembler code for function foo:
0x080483ed <+0>: push %ebp
0x080483ee <+1>: mov %esp,%ebp
0x080483f0 <+3>: sub $0x10,%esp
0x080483f3 <+6>: movl $0x5,-0x4(%ebp)
0x080483fa <+13>: mov -0x4(%ebp),%eax
0x080483fd <+16>: leave
0x080483fe <+17>: ret
End of assembler dump.
Technically, foo+0 and foo+1 are the function prologue, foo+3 to foo+13 is the function body, and foo+16 and foo+17 are the function epilogue. So only foo+3 to foo+13 represent the code between { and }. I would say that the C++ version is more correct in saying that i is out of scope before and after the function body.
To see that this is really a matter of debug symbols you can dump out gdb's internals of the debug structures with maintenance print symbols output_file_on_disk. For C it looks like:
block #000, object at 0x1847710, 1 syms/buckets in 0x80483ed..0x804840e
int foo(); block object 0x18470d0, 0x80483ed..0x80483ff
int main(int, char **); block object 0x18475d0, 0x80483ff..0x804840e section .text
block #001, object at 0x18476a0 under 0x1847710, 1 syms/buckets in 0x80483ed..0x804840e
typedef int int;
typedef char char;
block #002, object at 0x18470d0 under 0x18476a0, 1 syms/buckets in 0x80483ed..0x80483ff, function foo
int i; computed at runtime
block #003, object at 0x18475d0 under 0x18476a0, 2 syms/buckets in 0x80483ff..0x804840e, function main
int argc; computed at runtime
char **argv; computed at runtime
While this is C++
block #000, object at 0x1a3c790, 1 syms/buckets in 0x80483ed..0x804840e
int foo(); block object 0x1a3c0c0, 0x80483ed..0x80483ff
int main(int, char**); block object 0x1a3c640, 0x80483ff..0x804840e section .text
block #001, object at 0x1a3c720 under 0x1a3c790, 1 syms/buckets in 0x80483ed..0x804840e
typedef int int;
typedef char char;
block #002, object at 0x1a3c0c0 under 0x1a3c720, 0 syms/buckets in 0x80483ed..0x80483ff, function foo()
block #003, object at 0x1a3c050 under 0x1a3c0c0, 1 syms/buckets in 0x80483f3..0x80483fd
int i; computed at runtime
block #004, object at 0x1a3c640 under 0x1a3c720, 2 syms/buckets in 0x80483ff..0x804840e, function main(int, char**)
int argc; computed at runtime
char **argv; computed at runtime
So the debug symbols for the C++ code distinguish between the whole function (block #002) and the scope of the function body (block #003). This results in your observations.
(And to see that this is really not gdb just handling something wrong you can even analyze the binary with objdump on Linux or dumpbin on Windows. I did it on Linux and indeed it's the DWARF debug symbols that are different :-) )
It's not really a bug or a feature. The compiler is permitted to substitute functionally-equivalent code and generally does so if it can find a better way to do things. The example code is equivalent to doing nothing at all, so the compiler is free to remove it. This leaves the debugger with nothing to debug, which is good since debugging code that does nothing would be a waste of time anyway.

Assembler code in C++ code

How can I put Intel asm code into my c++ application?
I'm using Dev-C++.
I want to do something like this:
int temp = 0;
int usernb = 3;
pusha
mov eax, temp
inc eax
xor usernb, usernb
mov eax, usernb
popa
This is only example.
How can I do something like that?
UPDATE:
How does it look in Visual Studio ?
You can find a complete howto here http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html
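#include <stdlib.h>
int main()
{
int temp = 0;
int usernb = 3;
__asm__ volatile (
"pusha \n"
"mov eax, %[temp] \n"
"inc eax \n"
"mov ecx, %[usernb] \n"
"xor ecx, %[usernb] \n"
"mov %[usernb], ecx \n"
"mov eax, %[usernb] \n"
"popa \n"
: [usernb] "+m" (usernb) // read and written by the asm, so it must be an output
: [temp] "m" (temp) // read only
: "cc" ); // inc and xor modify the flags
exit(0);
}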
After that you need to compile with something like:
gcc -m32 -std=c99 -Wall -Wextra -masm=intel -o casm casmt.c && ./casm && echo $?
output:
0
You need to compile with the -masm=intel flag since you want intel assembly syntax :)
UPDATE: How does it look in Visual Studio ?
If you are building for 64 bit, you cannot use inline assembly in Visual Studio. If you are building for 32 bit, then you use __asm to do the embedding.
Generally, using inline ASM is a bad idea.
You're probably going to produce worse ASM than a compiler.
Using any ASM in a method generally defeats any optimizations which try to touch that method (i.e. inlining).
If you need to access specific features of the processor that are not obvious in C++ (e.g. SIMD instructions), you can use intrinsics, which are provided by almost every compiler vendor and are much more consistent with the language. Intrinsics give you all the speed of that "special" instruction but in a way which is compatible with the language semantics and with optimizers.
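As an illustration of the intrinsics route, the sketch below adds four floats at once with the SSE intrinsics from xmmintrin.h instead of hand-written asm. The function name add4 is hypothetical, and a scalar loop is used as a fallback when SSE is not available.

```cpp
#include <array>
#include <cstddef>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Hypothetical helper: element-wise sum of two 4-float arrays.
std::array<float, 4> add4(const std::array<float, 4>& a,
                          const std::array<float, 4>& b)
{
    std::array<float, 4> out;
#if defined(__SSE__)
    // Unaligned loads, one packed add, one unaligned store.
    __m128 va = _mm_loadu_ps(a.data());
    __m128 vb = _mm_loadu_ps(b.data());
    _mm_storeu_ps(out.data(), _mm_add_ps(va, vb));
#else
    // Scalar fallback for targets without SSE.
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = a[i] + b[i];
#endif
    return out;
}
```

Unlike an asm block, the compiler understands what these calls do, so it can inline the function, allocate registers itself, and optimize around the addition.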
Here's a simple example to show the syntax for GCC/Dev-C++:
int main(void)
{
int x = 10, y;
asm ("movl %1, %%eax;"
"movl %%eax, %0;"
:"=r"(y) /* y is output operand */
:"r"(x) /* x is input operand */
:"%eax"); /* %eax is clobbered register */
}
It depends on your compiler. But from your tags I guess you use gcc/g++, in which case you can use the gcc inline assembler. The syntax is quite weird and a bit different from Intel syntax, although it achieves the same result.
EDIT: With Visual Studio (or the Visual C++ compiler) it gets much easier, as it uses the usual Intel syntax.
If it's for some exercises, I'd recommend a real assembler and avoiding inlined code, as it can get rather messy/confusing.
Some basics using GCC can be found here.
If you're open to trying MSVC (not sure if GCC is a requirement), I'd suggest you have a look at MSVC's interpretation which is (in my opinion) a lot easier to read/understand, especially for learning assembler. An example can be found here.