LLVM-GCC ASM to LLVM in XCode - c++

I have the following two definitions, which compile (and work) just fine with Xcode's LLVM-GCC compiler:
#define SAVE_STACK(v)__asm { mov v, ESP }
#define RESTORE_STACK __asm {sub ESP, s }
However, when I change the compiler to Apple LLVM, I get the following error:
Expected '(' after 'asm'
I replaced the {} with (), but that doesn't do the trick. I googled the error but couldn't find anything useful. Anyone?

The __asm {...} style of inline assembly is non-standard and not supported by clang. Instead, C++ specifies the inline assembly syntax as asm("..."); note the quotes. Clang also uses AT&T assembly syntax by default, so the macros would need to be rewritten, not just re-punctuated.
However, some work has been going on to improve support for Microsoft's non-standard assembly syntax, and Intel style assembly along with it. There's an option -fenable-experimental-ms-inline-asm that enables what's been done so far, although I'm not sure when it was introduced or how good the support is in the version of clang you're using. A simple attempt with the code you show seems to work with a recent version of clang from the SVN trunk.
#define SAVE_STACK(v)__asm { mov v, ESP }
#define RESTORE_STACK __asm {sub ESP, s }
int main() {
int i;
int s;
SAVE_STACK(i);
RESTORE_STACK;
}
clang++ tmp.cpp -fms-extensions -fenable-experimental-ms-inline-asm -S -o -
.def main;
.scl 2;
.type 32;
.endef
.text
.globl main
.align 16, 0x90
main: # @main
# BB#0: # %entry
pushq %rax
#APP
.intel_syntax
mov dword ptr [rsp + 4], ESP
.att_syntax
#NO_APP
#APP
.intel_syntax
sub ESP, dword ptr [rsp]
.att_syntax
#NO_APP
xorl %eax, %eax
popq %rdx
ret
And the command clang++ tmp.cpp -fms-extensions -fenable-experimental-ms-inline-asm produces an executable that runs.
It does still produce warnings like the following though.
warning: MS-style inline assembly is not supported [-Wmicrosoft]
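For the portable asm("...") form mentioned at the start of this answer, here is a minimal sketch of what SAVE_STACK might look like in GNU extended asm with AT&T operand order. The helper name read_stack_pointer is made up for illustration, and the non-x86 fallback is only there so the snippet builds elsewhere. RESTORE_STACK is deliberately not translated: GCC's documentation says the compiler expects the stack pointer to have the same value after an asm statement as it had on entry, so adjusting ESP from inline asm is not something I'd rely on.

```cpp
#include <cstdint>

// Hypothetical replacement for SAVE_STACK(v): returns the current stack pointer.
// AT&T syntax: source operand first, destination second; %% yields a literal %.
inline std::uintptr_t read_stack_pointer()
{
    std::uintptr_t sp;
#if defined(__x86_64__)
    __asm__ volatile("mov %%rsp, %0" : "=r"(sp));
#elif defined(__i386__)
    __asm__ volatile("mov %%esp, %0" : "=r"(sp));
#else
    // Assumption: on other targets, approximate with the frame address builtin.
    sp = reinterpret_cast<std::uintptr_t>(__builtin_frame_address(0));
#endif
    return sp;
}
```

A caller would write something like std::uintptr_t saved = read_stack_pointer(); in place of SAVE_STACK(i).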

I had a similar problem: in the Xcode development environment the following code compiled correctly, but after switching to my makefile I received the error message Expected '(' after 'asm'.
#define DebugBreak() { __asm { int 3 }; }
int main(int argc, const char *argv[])
{
DebugBreak();
}
Note that the definition of DebugBreak() came from code of mine that compiled under Visual Studio.
I fixed this in my makefile by adding the argument -fasm-blocks:
CFLAGS += -std=c++11 -stdlib=libc++ -O2 -fasm-blocks
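If adding -fasm-blocks is not an option, one possible alternative is to define the breakpoint with GNU-style asm("...") instead of an MS-style block. This is only a sketch: the function name debug_break is made up, and the non-x86 branch (raising SIGTRAP) is my own assumption about a reasonable fallback, not something from the original code.

```cpp
#include <csignal>

// Hypothetical portable stand-in for the MS-style DebugBreak() macro.
inline void debug_break()
{
#if defined(__x86_64__) || defined(__i386__)
    // int3 is the one-byte x86 breakpoint instruction ("int 3" in the original).
    __asm__ volatile("int3");
#else
    // Assumption: on other targets, ask the OS to deliver a trap signal instead.
    std::raise(SIGTRAP);
#endif
}
```

Under a debugger this should stop at the call site; without one, the process receives SIGTRAP unless a handler is installed.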

Related

Referencing memory operands in .intel_syntax GNU C inline assembly

I'm catching a link error when compiling and linking a source file with inline assembly.
Here are the test files:
via:$ cat test.cxx
extern int libtest();
int main(int argc, char* argv[])
{
return libtest();
}
$ cat lib.cxx
#include <stdint.h>
int libtest()
{
uint32_t rnds_00_15;
__asm__ __volatile__
(
".intel_syntax noprefix ;\n\t"
"mov DWORD PTR [rnds_00_15], 1 ;\n\t"
"cmp DWORD PTR [rnds_00_15], 1 ;\n\t"
"je done ;\n\t"
"done: ;\n\t"
".att_syntax noprefix ;\n\t"
:
: [rnds_00_15] "m" (rnds_00_15)
: "memory", "cc"
);
return 0;
}
Compiling and linking the program results in:
via:$ g++ -fPIC test.cxx lib.cxx -c
via:$ g++ -fPIC lib.o test.o -o test.exe
lib.o: In function `libtest()':
lib.cxx:(.text+0x1d): undefined reference to `rnds_00_15'
lib.cxx:(.text+0x27): undefined reference to `rnds_00_15'
collect2: error: ld returned 1 exit status
The real program is more complex. The routine is out of registers so the flag rnds_00_15 must be a memory operand. Use of rnds_00_15 is local to the asm block. It is declared in the C code to ensure the memory is allocated on the stack and nothing more. We don't read from it or write to it as far as the C code is concerned. We list it as a memory input so GCC knows we use it and wire up the "C variable name" in the extended ASM.
Why am I receiving a link error, and how do I fix it?
Compile with gcc -masm=intel and don't try to switch modes inside the asm template string. AFAIK there's no equivalent before clang 14. (Note: macOS installs clang as gcc / g++ by default.)
Also, of course you need to use valid GNU C inline asm, using operands to tell the compiler which C objects you want to read and write.
Can I use Intel syntax of x86 assembly with GCC? (clang 14 supports -masm=intel like GCC.)
How to set gcc to use intel syntax permanently? (clang 13 and earlier didn't.)
I don't believe Intel syntax uses the percent sign. Perhaps I am missing something?
You're getting mixed up between %operand substitutions into the Extended-Asm template (which use a single %), vs. the final asm that the assembler sees.
You need %% to use a literal % in the final asm. You wouldn't use "mov %%eax, 1" in Intel-syntax inline asm, but you do still use "mov %0, 1" or %[named_operand].
See https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html. In Basic asm (no operands), there is no substitution and % isn't special in the template, so you'd write mov $1, %eax in Basic asm vs. mov $1, %%eax in Extended, if for some reason you weren't using an operand like mov $1, %[tmp] or mov $1, %0.
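To make the % versus %% distinction concrete, here is a small sketch in default AT&T syntax (so it assumes the file is not compiled with -masm=intel). The function names are invented for the example, and the non-x86 branch exists only so the snippet builds on other targets.

```cpp
#include <cstdint>

// %0 is substituted by the compiler with whatever register it picked.
std::uint32_t one_via_operand()
{
    std::uint32_t out;
#if defined(__x86_64__) || defined(__i386__)
    __asm__("movl $1, %0" : "=r"(out));
#else
    out = 1; // fallback, no inline asm on this target
#endif
    return out;
}

// %%eax becomes a literal %eax in the final asm; the "=a" constraint tells
// the compiler that the result is produced in EAX.
std::uint32_t one_via_eax()
{
    std::uint32_t out;
#if defined(__x86_64__) || defined(__i386__)
    __asm__("movl $1, %%eax" : "=a"(out));
#else
    out = 1; // fallback, no inline asm on this target
#endif
    return out;
}
```

Both functions return 1; the first leaves register choice to the compiler, which is usually the better default.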
uint32_t rnds_00_15; is a local with automatic storage. Of course there's no asm symbol with that name.
Use %[rnds_00_15] and compile with -masm=intel. (And remove the .att_syntax at the end; that would break the compiler-generated asm that comes after.)
You also need to remove the DWORD PTR, because the operand expansion already includes that, e.g. DWORD PTR [rsp - 4], and clang errors on DWORD PTR DWORD PTR [rsp - 4]. (GAS accepts it just fine, but the second one takes precedence, so it's pointless and potentially misleading.)
And you'll want a "=m" output operand if you want the compiler to reserve you some scratch space on the stack. You must not modify input-only operands, even if it's unused in the C. Maybe the compiler decides it can overlap something else because it's not written and not initialized (i.e. UB). (I'm not sure if your "memory" clobber makes it safe, but there's no reason not to use an early-clobber output operand here.)
And you'll want to avoid label name conflicts by using %= to get a unique number.
Working example (GCC and ICC, but not clang unfortunately), on the Godbolt compiler explorer (which uses -masm=intel depending on options in the dropdown). You can use "binary mode" (the 11010 button) to prove that it actually assembles after compiling to asm without warnings.
int libtest_intel()
{
uint32_t rnds_00_15;
// Intel syntax operand-size can only be overridden with operand modifiers
// because the expansion includes an explicit DWORD PTR
__asm__ __volatile__
( // ".intel_syntax noprefix \n\t"
"mov %[rnds_00_15], 1 \n\t"
"cmp %[rnds_00_15], 1 \n\t"
"je .Ldone%= \n\t"
".Ldone%=: \n\t"
: [rnds_00_15] "=&m" (rnds_00_15)
:
: // no clobbers
);
return 0;
}
Compiles (with gcc -O3 -masm=intel) to this asm. Also works with gcc -m32 -masm=intel of course:
libtest_intel:
mov DWORD PTR [rsp-4], 1
cmp DWORD PTR [rsp-4], 1
je .Ldone8
.Ldone8:
xor eax, eax
ret
I couldn't get this to work with clang: It choked on .intel_syntax noprefix when I left that in explicitly.
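For comparison, here is a sketch of the same scratch-flag idea written in default AT&T syntax, which needs no -masm option and, as far as I know, is accepted by both GCC and clang. The function name libtest_att is made up, and returning the flag is only there so the result can be observed; the real routine would presumably keep its own return value.

```cpp
#include <cstdint>

// Hypothetical AT&T-syntax counterpart of libtest_intel().
int libtest_att()
{
    std::uint32_t rnds_00_15;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__(
        "movl $1, %[rnds_00_15] \n\t" // operand expands to e.g. -4(%rsp)
        "cmpl $1, %[rnds_00_15] \n\t"
        "je .Ldone%= \n\t"            // %= gives a number unique to this asm statement
        ".Ldone%=: \n\t"
        : [rnds_00_15] "=m"(rnds_00_15) // write-only memory output: compiler reserves the slot
        :
        : "cc");
#else
    rnds_00_15 = 1; // fallback so the sketch builds on non-x86 targets
#endif
    return static_cast<int>(rnds_00_15);
}
```

Because the operand is an output, the compiler knows the asm writes it, so reading rnds_00_15 afterwards in C++ is well defined.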
Operand-size overrides:
You have to use %b[tmp] to get the compiler to substitute in BYTE PTR [rsp-4] to only access the low byte of a dword input operand. I'd recommend AT&T syntax if you want to do much of this.
Using %[rnds_00_15] results in Error: junk '(%ebp)' after expression.
That's because you switched to Intel syntax without telling the compiler. If you want it to use Intel addressing modes, compile with -masm=intel so the compiler can substitute into the template with the correct syntax.
This is why I avoid that crappy GCC inline assembly at nearly all costs. Man I despise this crappy tool.
You're just using it wrong. It's a bit cumbersome, but makes sense and mostly works well if you understand how it's designed.
Repeat after me: The compiler doesn't parse the asm string at all, except to do text substitutions of %operand. This is why it doesn't notice your .intel_syntax noprefix and keeps substituting AT&T syntax.
It does work better and more easily with AT&T syntax though, e.g. for overriding the operand-size of a memory operand, or adding an offset. (e.g. 4 + %[mem] works in AT&T syntax).
Dialect alternatives:
If you want to write inline asm that doesn't depend on -masm=intel or not, use Dialect alternatives (which make your code super-ugly; not recommended for anything other than wrapping one or two instructions).
This also demonstrates operand-size overrides:
#include <stdint.h>
int libtest_override_operand_size()
{
uint32_t rnds_00_15;
// Intel syntax operand-size can only be overridden with operand modifiers
// because the expansion includes an explicit DWORD PTR
__asm__ __volatile__
(
"{movl $1, %[rnds_00_15] | mov %[rnds_00_15], 1} \n\t"
"{cmpl $1, %[rnds_00_15] | cmp %k[rnds_00_15], 1} \n\t"
"{cmpw $1, %[rnds_00_15] | cmp %w[rnds_00_15], 1} \n\t"
"{cmpb $1, %[rnds_00_15] | cmp %b[rnds_00_15], 1} \n\t"
"je .Ldone%= \n\t"
".Ldone%=: \n\t"
: [rnds_00_15] "=&m" (rnds_00_15)
);
return 0;
}
With Intel syntax, gcc compiles it to:
mov DWORD PTR [rsp-4], 1
cmp DWORD PTR [rsp-4], 1
cmp WORD PTR [rsp-4], 1
cmp BYTE PTR [rsp-4], 1
je .Ldone38
.Ldone38:
xor eax, eax
ret
With AT&T syntax, it compiles to:
movl $1, -4(%rsp)
cmpl $1, -4(%rsp)
cmpw $1, -4(%rsp)
cmpb $1, -4(%rsp)
je .Ldone38
.Ldone38:
xorl %eax, %eax
ret

weak symbols and custom sections in inline assembly

I'm stuck with a problem which is illustrated by the following g++ code:
frob.hpp:
template<typename T> T frob(T x);
template<> inline int frob<int>(int x) {
asm("1: nop\n"
".pushsection \"extra\",\"a\"\n"
".quad 1b\n"
".popsection\n");
return x+1;
}
foo.cpp:
#include "frob.hpp"
extern int bar();
int foo() { return frob(17); }
int main() { return foo() + bar(); }
bar.cpp:
#include "frob.hpp"
int bar() { return frob(42); }
I'm doing these quirky custom-section things as a way to mimic the mechanism here in the Linux kernel (but in a userland, C++ way).
My problem is that the instantiation of frob<int> is recognized as a weak symbol, which is fine, and one of the two is eventually elided by the linker, which is fine too. Except that the linker is now disturbed by the fact that the extra section has references to that symbol (via .quad 1b), and the linker wants to resolve them locally. I get:
localhost /tmp $ g++ -O3 foo.cpp bar.cpp
localhost /tmp $ g++ -O0 foo.cpp bar.cpp
`.text._Z4frobIiET_S0_' referenced in section `extra' of /tmp/ccr5s7Zg.o: defined in discarded section `.text._Z4frobIiET_S0_[_Z4frobIiET_S0_]' of /tmp/ccr5s7Zg.o
collect2: error: ld returned 1 exit status
(-O3 is fine because no symbol is emitted at all.)
I don't know how to work around this.
Would there be a way to tell the linker to also pay attention to symbol resolution in the extra section?
Perhaps one could trade the local labels for .weak global labels? E.g. like in:
asm(".weak exception_handler_%=\n"
"exception_handler_%=: nop\n"
".pushsection \"extra\",\"a\"\n"
".quad exception_handler_%=\n"
".popsection\n"::);
However, I fear that if I go this way, distinct asm statements in distinct compilation units may get the same symbol via this mechanism (may they?).
Is there a way around this that I've overlooked?
g++ (5 and 6, at least) compiles an inline function with external linkage, such as template<> inline int frob<int>(int x), as a weak global symbol in a COMDAT function section in its own section group. See:
g++ -S -O0 bar.cpp
bar.s
.file "bar.cpp"
.section .text._Z4frobIiET_S0_,"axG",@progbits,_Z4frobIiET_S0_,comdat
.weak _Z4frobIiET_S0_
.type _Z4frobIiET_S0_, @function
_Z4frobIiET_S0_:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
#APP
# 8 "frob.hpp" 1
1: nop
.pushsection "extra","a"
.quad 1b
.popsection
# 0 "" 2
#NO_APP
movl -4(%rbp), %eax
addl $1, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
...
...
The relevant directives are:
.section .text._Z4frobIiET_S0_,"axG",@progbits,_Z4frobIiET_S0_,comdat
.weak _Z4frobIiET_S0_
(The compiler-generated #APP and #NO_APP delimit your inline assembly).
Do as the compiler does by making extra likewise a COMDAT section in
a section-group:
frob.hpp (fixed)
template<typename T> T frob(T x);
template<> inline int frob<int>(int x) {
asm("1: nop\n"
".pushsection \"extra\", \"axG\", @progbits,extra,comdat" "\n"
".quad 1b\n"
".popsection\n");
return x+1;
}
and the linkage error will be cured:
$ g++ -O0 foo.cpp bar.cpp
$ ./a.out; echo $?
61

Clang built from source compiles C but not C++ code

I've recently compiled clang on windows (host: x86_64-pc-windows64 ; compiler: i686-pc-mingw32 ; target: i686-pc-mingw32).
The CMakeCache (for the config) can be found: here
My issue is that while clang works fine (for C), clang++ (for C++) will "successfully" compile and link, but the resulting program itself won't run and will exit with an error code 1. Here's a sample below (oh-my-zsh):
➜ bin cat test.c
#include <stdio.h>
int main()
{
printf("Hello World!\n");
return 0;
}
➜ bin cat test.cpp
#include <iostream>
int main()
{
std::cout<<"Hello World!"<<std::endl;
return 0;
}
➜ bin ./clang++ test.cpp -o a.exe
➜ bin ./clang test.c -o b.exe
➜ bin ./a.exe
➜ bin ./b.exe
Hello World!
➜ bin
As is visible here, b.exe (C) works fine, but a.exe (C++), while it compiles and links, gives no output.
Could anyone give me a hint as to why this is so, and how I can fix it?
Note: the pre-compiled snapshot of clang for windows (also 32 bit) works fine with my current path configuration.
Note: a.exe (C++, failed) returns non-zero.
DATA:
CLANG VERSIONS:
Snap: clang version 3.5 (208017) ; Comp: clang version 3.4 (tags/RELEASE_34/final)
LLVM FILES: snapshot ; compiled ; diff
PREPROCESSING FILES: snapshot ; compiled ; diff
ASM FILES: snapshot ; compiled ; diff
VERBOSE OUTPUT: snapshot ; compiled
Your new clang uses a different (incorrect) calling convention, not x86_thiscallcc.
snap.s from good clang:
movl $__ZStL8__ioinit, %ecx
calll __ZNSt8ios_base4InitC1Ev
movl %esp, %ecx
movl $__ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_, (%ecx)
movl %eax, %ecx
calll __ZNSolsEPFRSoS_E
Same code from your custom clang, comp.s:
leal __ZStL8__ioinit, %eax
movl %eax, (%esp)
calll __ZNSt8ios_base4InitC1Ev
movl %eax, (%esp)
movl %ecx, 4(%esp)
calll __ZNSolsEPFRSoS_E
and several others.
In the LLVM IR (*.ll files), the correct calling convention is marked with x86_thiscallcc in function declarations and on call instructions:
< call void @_ZNSt8ios_base4InitC1Ev(%"class.std::ios_base::Init"* @_ZStL8__ioinit)
> call x86_thiscallcc void @_ZNSt8ios_base4InitC1Ev(%"class.std::ios_base::Init"* @_ZStL8__ioinit)
< declare void @_ZNSt8ios_base4InitC1Ev(%"class.std::ios_base::Init"*) #0
> declare x86_thiscallcc void @_ZNSt8ios_base4InitC1Ev(%"class.std::ios_base::Init"*) #0
32c33
< declare void @_ZNSt8ios_base4InitD1Ev(%"class.std::ios_base::Init"*) #0
> declare x86_thiscallcc void @_ZNSt8ios_base4InitD1Ev(%"class.std::ios_base::Init"*) #0
< call void @_ZNSt8ios_base4InitD1Ev(%"class.std::ios_base::Init"* @_ZStL8__ioinit)
> call x86_thiscallcc void @_ZNSt8ios_base4InitD1Ev(%"class.std::ios_base::Init"* @_ZStL8__ioinit)
< %3 = call %"class.std::basic_ostream"* @_ZNSolsEPFRSoS_E(%"class.std::basic_ostream"* %2, %"class.std::basic_ostream"* (%"class.std::basic_ostream"*)* @_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_)
> %call1 = call x86_thiscallcc %"class.std::basic_ostream"* @_ZNSolsEPFRSoS_E(%"class.std::basic_ostream"* %call, %"class.std::basic_ostream"* (%"class.std::basic_ostream"*)* @_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_)
< declare %"class.std::basic_ostream"* @_ZNSolsEPFRSoS_E(%"class.std::basic_ostream"*, %"class.std::basic_ostream"* (%"class.std::basic_ostream"*)*) #0
> declare x86_thiscallcc %"class.std::basic_ostream"* @_ZNSolsEPFRSoS_E(%"class.std::basic_ostream"*, %"class.std::basic_ostream"* (%"class.std::basic_ostream"*)*) #0
In the preprocessed files I see the difference. In snap.E many functions are declared with __attribute__((__cdecl__)), and in comp.E they are declared with just __cdecl__. You should check why the declarations differ after preprocessing. I think the new clang may predefine a different set of macros (gcc has the -dM -E options to dump predefined macros, and clang accepts the same options). Or your clang just uses different headers (or different versions of the headers; you can list the headers used with clang's -H option).
Another approach is to check whether __attribute__((__cdecl__)) should be equivalent to __cdecl__, and whether the newer version of clang changes anything in how it handles them.

Issues with SIMD functions in GNU c & c++

Environment Details:
Machine: Core i5 M540 processor running Centos 64 bits in a virtual machine in VMware player.
GCC: 4.8.2 built from source tar.
Issue:
I am trying to learn more about SIMD functions in C/C++ and for that I created the following helloworld program.
#include <iostream>
#include <pmmintrin.h>
int main(void){
__m128i a, b, c;
a = _mm_set_epi32(1, 1, 1, 1);
b = _mm_set_epi32(2, 3, 4, 5);
c = _mm_add_epi32(a,b);
std::cout << "Value of first int: " << c[0];
}
When I look at the assembly output for it using the following command I do not see the SIMD instructions.
g++ -S -I/usr/local/include/c++/4.8.2 -msse3 -O3 hello.cpp
Sample of the assembly generated:
movl $.LC2, %esi
movl $_ZSt4cout, %edi
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
movabsq $21474836486, %rsi
movq %rax, %rdi
call _ZNSo9_M_insertIxEERSoT_
xorl %eax, %eax
Please advise the correct way of writing or compiling the SIMD code.
Thank you!!
It looks like your compiler is optimizing away the calls to _mm_foo_epi32, since all the values are known. Try taking all the relevant inputs from the user and see what happens.
Alternately, compile with -O0 instead of -O3 and see what happens.
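As a sketch of the first suggestion, the addition can be moved into a function whose inputs arrive through pointers, so the compiler cannot fold the result at compile time. The function name add4_epi32 is made up, and the scalar branch is only a fallback for targets without SSE2. With g++ -O3 -S on x86-64, I would expect a paddd instruction to show up in the body of this function, though I have not checked this against GCC 4.8.2 specifically.

```cpp
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h> // SSE2: _mm_loadu_si128, _mm_add_epi32, _mm_storeu_si128
#endif

// Hypothetical helper: adds four 32-bit integers lane by lane.
void add4_epi32(const std::int32_t* a, const std::int32_t* b, std::int32_t* out)
{
#if defined(__SSE2__)
    // Unaligned loads, so the caller's arrays need no special alignment.
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_add_epi32(va, vb));
#else
    // Scalar fallback so the sketch still builds without SSE2.
    for (int i = 0; i < 4; ++i) out[i] = a[i] + b[i];
#endif
}
```

Reading the values for a and b from std::cin (or argv) in main and then calling add4_epi32 keeps the constants out of the compiler's sight.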

Inline assembly troubles

I tried to compile, with GCC, inline assembly code that compiled fine with MSVC, but got the following errors for basic operations:
// var is a template variable in a C++ function
__asm__
{
mov edx, var //error: Register name not specified for %edx
push ebx //error: Register name not specified for %ebx
sub esp, 8 //error: Register name not specified for %esp
}
After looking through documentation covering the topic, I found out that I should probably convert (even if I am only interested in x86) Intel style assembly code to AT&T style. However, after trying to use AT&T style I got even more weird errors:
mov var, %edx //error: Expected primary-expression before % token
mov $var, edx //error: label 'LASM$$s' used but not defined
I should also note that I tried to use LLVM-GCC, but it failed miserably with internal errors after encountering inline assembly.
What should I do?
For Apple's gcc you want -fasm-blocks which allows you to omit gcc's quoting requirement for inline asm and also lets you use Intel syntax.
// test_asm.c
int main(void)
{
int var;
__asm__
{
mov edx,var
push ebx
sub esp,8
}
return 0;
}
Compile this with:
$ gcc -Wall -m32 -fasm-blocks test_asm.c -o test_asm
Tested with gcc 4.2.1 on OS X 10.6.
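Where -fasm-blocks is not available, the first line of the block (mov edx, var) can be expressed in GNU extended asm by letting the compiler substitute the C variable as an operand. The following is only a sketch in default AT&T syntax: the function name copy_through_edx is invented, and the push ebx / sub esp, 8 lines are left out on purpose, because changing the stack pointer behind the compiler's back is not safe in this style of inline asm.

```cpp
// Hypothetical helper; mirrors only the "mov edx, var" line of the question.
int copy_through_edx(int var)
{
#if defined(__x86_64__) || defined(__i386__)
    int out;
    __asm__ volatile(
        "movl %[in], %%edx \n\t"  // AT&T order: source, then destination
        "movl %%edx, %[out] \n\t"
        : [out] "=r"(out)         // output: any general-purpose register
        : [in] "r"(var)           // input: var, placed in a register by the compiler
        : "edx");                 // tell the compiler EDX is overwritten
    return out;
#else
    return var; // fallback so the sketch builds on non-x86 targets
#endif
}
```

The operand names in and out exist only inside the asm statement; the compiler replaces %[in] and %[out] with the registers it chose.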
g++'s inline assembler is much more flexible than MSVC's, and much more complicated. It treats an asm directive as a pseudo-instruction, which has to be described in the language of the code generator. Here is a working sample from my own code (for MinGW, not Mac):
// int BNASM_Add (DWORD* result, DWORD* a, int len)
//
// result += a
int BNASM_Add (DWORD* result, DWORD* a, int len)
{
int carry ;
asm volatile (
".intel_syntax\n"
" clc\n"
" cld\n"
"loop03:\n"
" lodsd\n"
" adc [edx],eax\n"
" lea edx,[edx+4]\n" // add edx,4 without disturbing the carry flag
" loop loop03\n"
" adc ecx,0\n" // Return the carry flag (ecx known to be zero)
".att_syntax\n"
: "=c"(carry) // Output: carry in ecx
: "d"(result), "S"(a), "c"(len) // Input: result in edx, a in esi, len in ecx
) ;
return carry ;
}
You can find documentation at http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Extended-Asm.