How does LLVM IR use functions in libc?

How does LLVM IR use functions in libc, such as open, socket, etc.? Is there a specific example?

LLVM IR allows calls to functions by name. Just like in C, the function must be declared. In LLVM IR, the syntax looks like:
;; Sample declaration of a function in libc.
declare i32 @strlen(i8*)

;; Test code using it.
define i32 @test(i8* %a, i8* %b) {
  %A = call i32 @strlen(i8* %a)
  %B = call i32 @strlen(i8* %b)
  %c = add i32 %A, %B
  ret i32 %c
}
You can always take a look at the textual LLVM IR that clang generates for any given C code: clang -S -emit-llvm client.c -o client.ll -O1 produces client.ll with light optimization.
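The same pattern works for the functions the question mentions. Here is a minimal sketch declaring and calling socket() from IR; the numeric constants for AF_INET and SOCK_STREAM below are the usual Linux values and are assumed here, not taken from the original answer:
;; socket() has the C prototype: int socket(int domain, int type, int protocol)
declare i32 @socket(i32, i32, i32)

define i32 @open_tcp_socket() {
  ;; AF_INET = 2, SOCK_STREAM = 1 (typical Linux values, assumed here)
  %fd = call i32 @socket(i32 2, i32 1, i32 0)
  ret i32 %fd
}
In real code you would get the exact declarations and constants by compiling a small C file with clang -S -emit-llvm and copying what it produces.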

Related

clang BlocksRuntime embeds 'obsolete compiler' warning in executable when using __block

#include <stdio.h>
#include <Block.h>
int main()
{
    __block int x = 5;
    ^{ printf("x is %i\n", x); }();
}
When I use clang to compile a C (or C++) program that uses both clang's blocks and the __block type specifier, no compiler warnings are produced, even with -Wall and -Wpedantic. The program also runs as expected. However, if I open the executable in a text editor, I find this block of text (extracted using the 'strings' command):
Block_release called upon a stack Block: %p, ignored
_Block_byref_release: Block byref data structure at %p underflowed
Block compiled by obsolete compiler, please recompile source for this Block
descriptor->dispose helper: %p
byref data block %p contents:
^%p (new layout) =
isa?: %p
refcount: %u
invoke: %p
descriptor: %p
descriptor->reserved: %lu
descriptor->size: %lu
descriptor->copy helper: %p
forwarding: %p
flags: 0x%x
size: %d
copy helper: %p
dispose helper: %p
NULL passed to _isa: stack Blockisa: malloc heapisa: GC heap Bloisa: global Blocisa: finalizing
The problem is that I am using the latest version of clang - hardly an 'obsolete' compiler. I am on Linux (musl) and I installed the BlocksRuntime from https://github.com/mackyle/blocksruntime. I also found the piece of code that generates the warning here - https://github.com/mackyle/blocksruntime/blob/master/BlocksRuntime/runtime.c#L629
λ: clang --version
clang version 10.0.0
Target: x86_64-unknown-linux-musl
Thread model: posix
InstalledDir: /bin
λ: uname -a
Linux thinkpad 5.7.9_1 #1 SMP Thu Jul 16 10:02:50 UTC 2020 x86_64 GNU/Linux
Is it safe to ignore this warning? If not, what can I do about it?
I've just realized that the warning is probably meant to be a runtime error stored as a string in the executable, which means there is no problem. Disassembling seems to support this idea: radare2 shows the string in the .rodata section.
I really should have thought of that earlier...
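For anyone who wants to double-check the same thing without a disassembler, dumping the read-only data section with readelf (from GNU binutils) shows the message as plain data; the section name here is taken from the radare2 output above:
readelf -p .rodata ./a.out | grep obsolete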

asm function with c++

I would like to add a method to my class using assembler language. How can I do it?
example:
main.cpp
Struct ex {
int field1;
asm_method(char*);
}
add.asm
asm_method:
//some asm code
Get the asm output the compiler generates for a non-inline definition of the C++ member function, and use that as a starting point for an asm source file. This works for any ISA with any compiler that can emit valid asm (which is most of them, although apparently MSVC emits a bunch of extra junk that you have to remove).
Example with GCC (for x86-64 GNU/Linux, but works anywhere)
Also works with clang.
e.g. g++ -O3 -fverbose-asm -masm=intel -S -o foo_func.S foo.cpp (How to remove "noise" from GCC/clang assembly output?)
That .S file is now your asm source file. Remove the compiler-generated instruction lines and insert your own.
Obviously you need to know the calling convention and other stuff like that (e.g. for x86 see https://www.agner.org/optimize/#manuals for a calling convention guide), but this will get the compiler to do the name mangling for you, for that specific target platform's ABI.
struct ex {                      // lower case struct not Struct
    int field1;
    void *asm_method(char*);     // methods need a return type
};                               // struct declarations end with a ;

void *ex::asm_method(char*) {
    return this;                 // easy way to find out what register `this` is passed in.
}
compiles as follows for x86-64 System V, with g++ -O3 (Godbolt with Linux gcc and Windows MSVC)
    # x86-64 System V: GNU/Linux g++ -O3
    # This is GAS syntax
    .intel_syntax noprefix
    .text                                    # .text section is already the default at top of file
    .align 2
    .p2align 4                               # aligning functions by 16 bytes is typical
    .globl _ZN2ex10asm_methodEPc             # the symbol is global, not private to this file
    .type _ZN2ex10asm_methodEPc, @function   # (optional) and it's a function.
_ZN2ex10asm_methodEPc:                       # a label defines the symbol
    .cfi_startproc
    ## YOUR CODE GOES HERE ##
    ## RSP-8 is aligned by 16 in x86-64 SysV and Windows ##
    mov rax, rdi                             # copy first arg (this) to return-value register.
    ret                                      # pop into program counter
    .cfi_endproc
    .size _ZN2ex10asm_methodEPc, .-_ZN2ex10asm_methodEPc   # maybe non-optional for dynamic linking
It's probably fine to omit the .cfi stack-unwind directives from hand-written asm for leaf functions, since you're not going to be throwing C++ exceptions from hand-written asm (I hope).
This depends on your target platform and compiler/toolchain and is generally too broad a question for StackOverflow.
For example, the C++ compiler in the GCC toolchain actually generates assembly from C++, and then produces object files from that assembly. Then the linker links together multiple object files to produce an ELF module.
You can bypass the C++ compilation step for a single object file and directly write .asm files.
You can compile it the same way you compile a .c file: gcc -c myfile.S -o myfile.o.
Though you should take platform ABI into account such that you can accept function arguments and return values via the correct registers. The platform ABI also specifies the calling convention and which registers should be preserved across function calls. Finally, you need to produce correct function names according to C++ name mangling rules, or use C naming rules (which are simpler) and declare your function extern "C".
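For example, here is a minimal sketch of the extern "C" route; the helper name asm_method_impl and its signature are made up here for illustration:
// main.cpp -- the member function just forwards to a C-linkage symbol
// that is implemented in add.asm under the unmangled name asm_method_impl.
extern "C" void *asm_method_impl(void *self, char *arg);

struct ex {
    int field1;
    void *asm_method(char *arg) { return asm_method_impl(this, arg); }
};
This keeps all the name mangling on the C++ side, so the asm file only has to export a plain C symbol.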
For more details see C++ to ASM linkage and for Linux ABI refer to System V ABI.
For Windows start here: calling conventions and compiling assembly in Visual Studio.

How to turn off the constant folding optimization in llvm

I am new to clang and LLVM. I'm trying to generate an unoptimized version of the bitcode from C source code, but I found that the generated bitcode has the constant folding optimization applied, which I don't want.
I'm using this command: clang -O0 -Xclang -disable-O0-optnone test1.c -S -emit-llvm -o test1.ll
The test1.c file has the following code:
int test() {
    int y;
    y = 2 * 4;
    return y;
}
The relevant part of test1.ll shows that, instead of generating an instruction for multiplying 2 and 4, clang directly stores the value 8, having already done the constant folding:
store i32 8, i32* %1, align 4
It would be really nice if someone could kindly let me know what I am missing and how I can turn off the constant folding optimization. The version of LLVM I am using is 6.0.0.
Thank you.
It is a Clang feature and can't be turned off even with -O0. To work around this, try making the variables global, passing them as parameters to the function, or just writing the IR manually.
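A quick sketch of the parameter workaround mentioned above: with the operands arriving as arguments, the front end no longer sees two literal constants, so even at -O0 the multiplication survives as a real instruction.
/* test2.c: same computation, but with the operands passed in as parameters */
int test(int a, int b) {
    int y;
    y = a * b;   /* emitted as a mul instruction in the .ll output, even at -O0 */
    return y;
}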

How to properly link pre-generated LLVM IR with runtime generated IR?

What is the proper way to load a set of pre-generated LLVM IR and make it available to runtime JIT modules such that the same types aren't given new names and inlining and const propagation can still take place?
My attempt so far:
I compile those C functions to LLVM IR offline via clang -c -emit-llvm -S -ffast-math -msse2 -O3 -o MyLib.ll MyLib.c
For each runtime JIT module, I load the generated LLVM IR via llvm::parseIRFile() and "paste" it into the runtime JIT module via llvm::Linker::linkModules().
This works fine for the first JIT module, but not for subsequent JIT modules. Each time llvm::parseIRFile() is called, the resulting module's type definitions are given new names.
For example, offline MyLib.ll looks like this:
%struct.A = type { <4 x float> }
define <4 x float> @"Foo(A*)"(%struct.A* nocapture readonly)
{
...
}
The resulting module from the first call to llvm::parseIRFile() looks exactly like the offline version. The resulting module from the second call to llvm::parseIRFile() instead looks like:
%struct.A.1 = type { <4 x float> }
define <4 x float> @"Foo(A*)"(%struct.A.1* nocapture readonly)
{
...
}
Note that %struct.A was renamed to %struct.A.1. The runtime JIT module continues to generate code using %struct.A and thus fails to call Foo(), since Foo() now takes %struct.A.1 instead.
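For reference, here is a minimal sketch of the load-and-link step described above, using the LLVM C++ API roughly as it looks in LLVM 6; the wrapper name addLibTo is invented for this example:
#include <memory>
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Linker/Linker.h"
#include "llvm/Support/SourceMgr.h"

// Parse MyLib.ll into the JIT module's context and splice it into the module.
bool addLibTo(llvm::Module &jitModule, llvm::LLVMContext &ctx) {
    llvm::SMDiagnostic err;
    // Every call re-parses the file into the same context; identified struct
    // types that already exist there get renamed (%struct.A -> %struct.A.1),
    // which is exactly the symptom shown above.
    std::unique_ptr<llvm::Module> lib = llvm::parseIRFile("MyLib.ll", err, ctx);
    if (!lib)
        return false;
    // linkModules() consumes the source module and returns true on error.
    return !llvm::Linker::linkModules(jitModule, std::move(lib));
}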

LLVM with CUDA inline assembly

I am trying to compile a CUDA code with following inline assembly:
static __device__ uint get_smid(void) {
    uint ret;
    asm("mov.u32 %0, %smid;" : "=r"(ret) );
    return ret;
}
The code compiles fine with nvcc with the flag -Xptxas -v.
When I try to compile it with clang++ (version 4.0), with the corresponding flag -Xcuda-ptxas -v (I think this is right, but I may be mistaken), I get the following error:
../../include/cutil_subset.h:23:25: error: invalid % escape in inline assembly string
asm("mov.u32 %0, %smid;" : "=r"(ret) );
It points to %smid.
I think I am supposed to link the proper library, but I already have this too: -L/cuda/install/lib.
Another possibility is NVPTX asm incompatibility. On this page, it is explained that LLVM has different definitions for all PTX variables (there are some for smid and warpid as well). Now I am not sure whether the mentioned code has to be written separately (not inline) and compiled as such.
Has anybody dealt with a similar issue before? Suggestions are welcome.
You need to reference the special register with a double percent sign: %%smid.
The %% escape sequence gets converted to a single percent sign during compilation, so that ptxas sees the correct special register name. The double percent sign version also works under nvcc.
nvcc seems to be more forgiving with escape sequences in inline assembler than clang++ is, and leaves unknown escape sequences untouched rather than emitting an error as clang does in this case.
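So the fixed version of the snippet from the question is simply:
static __device__ uint get_smid(void) {
    uint ret;
    asm("mov.u32 %0, %%smid;" : "=r"(ret));  // %%smid becomes %smid in the emitted PTX
    return ret;
}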