I'm experimenting with LLVM at the moment. I'd like to use languages that compile to LLVM bitcode for scripting. So far I've managed to load an LLVM bitcode module and call a function defined in it from my 'internal' C++ code. Next I tried to expose a C++ function from my internal code to the JIT'd code - so far all I've managed to get out of that effort is a SEGFAULT.
My code is as follows. I've tried to create a Function and a global mapping in my execution engine that points to a function I'd like to call.
extern "C" void externGuy()
{
    cout << "I'm the extern guy" << endl;
}

void ExposeFunction()
{
    std::vector<Type*> NoArgs(0);
    FunctionType* FT = FunctionType::get(Type::getVoidTy(getGlobalContext()), NoArgs, false);
    Function* fnc = Function::Create(FT, Function::ExternalLinkage, "externGuy", StartModule);
    JIT->addGlobalMapping(fnc, (void*)externGuy);
}
// ... Create module, create execution engine
ExposeFunction();
Is the problem that I can't add a function to the module after it's been loaded from a bitcode file?
Update:
I've refactored my code so that it reads like so instead:
// ... Create module, create execution engine
std::vector<Type*> NoArgs(0);
FunctionType* FT = FunctionType::get(Type::getVoidTy(getGlobalContext()), NoArgs, false);
Function* fnc = Function::Create(FT, Function::ExternalLinkage, "externGuy", m);
fnc->dump();
JIT->addGlobalMapping(fnc, (void*)externGuy);
So instead of a segfault I get:
Program used external function 'externGuy' which could not be resolved
Also, the result of dump() prints:
declare void @externGuy1()
If I change my C++ scripting bitcode to call externGuy1() instead of externGuy(), it suggests that I meant externGuy. The addGlobalMapping just doesn't seem to be working for me, and I'm not sure what I'm missing here. I also added -fPIC to my compilation command, as I saw suggested in another question - I'm honestly not sure whether it's helped anything, but there's no harm in trying.
I finally got this to work. My suspicion is that creating the function, in addition to declaring it in the script, produced more than one declaration of the same name, and the mapping was attached to the wrong one, so it never worked out. What I did instead was declare the function in the script and then fetch it with getFunction to use for the mapping. Calling dump() on that function outputs:
declare void @externGuy() #1
Which is why I think that had something to do with the mapping not working originally. I also remember the Kaleidoscope tutorial saying that getPointerToFunction does the JIT compiling when it's called, if that hasn't been done already, so I figured I had to set up the mapping before that call.
So, altogether, getting this whole thing to work looked like this:
// Get the function and map it
Function* extrn = m->getFunction("externGuy");
extrn->dump();
JIT->addGlobalMapping(extrn, (void*)&::externGuy);
// Get a pointer to the jit compiled function
Function* mane = m->getFunction("hello");
void* fptr = JIT->getPointerToFunction(mane);
// Make a call to the jit compiled function which contains a call to externGuy
void (*FP)() = (void(*)())(intptr_t)fptr;
FP();
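For completeness, with the legacy JIT ExecutionEngine the last two steps can also be done without the manual function-pointer cast. A small sketch, assuming the same JIT and m variables as above and the GenericValue type from llvm/ExecutionEngine/GenericValue.h:
// Alternative: let the ExecutionEngine marshal the (empty) argument list
// and run the JIT-compiled "hello" function itself
std::vector<GenericValue> noArgs;
JIT->runFunction(m->getFunction("hello"), noArgs);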
I am trying to compile a C++ library to wasm32-wasi for use inside my Rust application. However, I am running into this issue:
Error: failed to run main module `target/wasm32-wasi/release/so.wasm`
Caused by:
0: failed to instantiate "target/wasm32-wasi/release/so.wasm"
1: unknown import: `env::_ZdlPv` has not been defined
This error relates to line 5 in my mylib.cpp, which just declares a variable:
#include "OpenXLSX/OpenXLSX.hpp"

extern "C" int test() {
    std::string i = "";
    OpenXLSX::XLXmlData s;
    return 0;
}
My main.rs:
#[link(name = "mylib")]
extern "C" {
    pub fn test() -> i32;
}

pub fn main() {
    let res = unsafe { test() };
    println!("test code: {}", res);
}
And my build.rs, where I assume the error is:
use std::env;
fn main() {
cc::Build::new()
.cpp_link_stdlib(None)
.cpp(true)
.flag("-std=c++17")
.archiver("llvm-ar")
.include("OpenXLSX/external/pugixml")
.include("OpenXLSX/external/nowide")
.include("OpenXLSX/external/zippy")
.include("OpenXLSX/headers")
.include("OpenXLSX")
.flag("--sysroot=/opt/wasi-sysroot")
.flag("-fvisibility=default")
.file("OpenXLSX/sources/XLCell.cpp")
.file("OpenXLSX/sources/XLCellIterator.cpp")
.file("OpenXLSX/sources/XLCellRange.cpp")
.file("OpenXLSX/sources/XLCellReference.cpp")
.file("OpenXLSX/sources/XLCellValue.cpp")
.file("OpenXLSX/sources/XLColor.cpp")
.file("OpenXLSX/sources/XLColumn.cpp")
.file("OpenXLSX/sources/XLContentTypes.cpp")
.file("OpenXLSX/sources/XLDateTime.cpp")
.file("OpenXLSX/sources/XLDocument.cpp")
.file("OpenXLSX/sources/XLFormula.cpp")
.file("OpenXLSX/sources/XLProperties.cpp")
.file("OpenXLSX/sources/XLRelationships.cpp")
.file("OpenXLSX/sources/XLRow.cpp")
.file("OpenXLSX/sources/XLRowData.cpp")
.file("OpenXLSX/sources/XLSharedStrings.cpp")
.file("OpenXLSX/sources/XLSheet.cpp")
.file("OpenXLSX/sources/XLWorkbook.cpp")
.file("OpenXLSX/sources/XLXmlData.cpp")
.file("OpenXLSX/sources/XLXmlFile.cpp")
.file("OpenXLSX/sources/XLZipArchive.cpp")
.file("OpenXLSX/external/pugixml/pugixml.cpp")
.compile("OpenXLSX");
cc::Build::new()
.archiver("llvm-ar")
.cpp_link_stdlib(None)
.cpp(true)
.flag("-fvisibility=default")
.flag("-std=c++17")
.include("OpenXLSX/external/pugixml")
.include("OpenXLSX/headers")
.include("OpenXLSX")
.flag("--sysroot=/opt/wasi-sysroot")
.file("mylib.cpp")
.compile("libmylib.a");
}
In a separate attempt, instead of step 1 in my build.rs where I compile the OpenXLSX files, I used CMake to generate a single .a file compiled for wasm32-wasi with the wasi-sysroot, which I then tried loading:
let src_dir = env::var("CARGO_MANIFEST_DIR").unwrap();
println!("cargo:rustc-link-lib=static=OpenXLSX");
println!("cargo:rustc-link-search=native={}/wasm-libs", src_dir);
I also tried to generate the bindings with bindgen, but it gave me duplicate-definition errors, which is the main reason I am writing my code in a C++ function and trying to call that from Rust. I assume there would still be a linking issue even if I got bindgen to work.
In short, your C++ code is using operator delete(void*), but you are not linking the C++ standard library that usually provides the implementation for it.
In a bit more detail:
_ZdlPv is the mangled name of operator delete(void*). That's the standard C++ delete operator that's used to free objects allocated on the heap with new. Check this SO answer. You can also check the name on Demangler.com.
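As a quick check, running the name through c++filt prints operator delete(void*). The same can be done programmatically via the Itanium ABI demangler; a minimal sketch, assuming a GCC or Clang toolchain:
#include <cxxabi.h>
#include <cstdio>
#include <cstdlib>

int main() {
    int status = 0;
    // __cxa_demangle returns a malloc'd, human-readable name when status == 0
    char *name = abi::__cxa_demangle("_ZdlPv", nullptr, nullptr, &status);
    std::printf("%s\n", status == 0 ? name : "demangling failed");  // prints: operator delete(void*)
    std::free(name);
    return 0;
}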
On line 5, OpenXLSX::XLXmlData s; creates an XLXmlData instance. XLXmlData has a std::unique_ptr as a member, so when an XLXmlData object goes out of scope (at the end of test()) its destructor ~XLXmlData() is called, and in it the compiler generates code to clean up all members, including the unique_ptr. That code contains a call to the delete operator. Check this (abridged) godbolt.org version of XLXmlData to see what code gets generated; the call to delete is on line 38.
Even though delete is a C++ keyword, under the hood the compiler emits just a regular function call (to the mangled _ZdlPv name, as we already saw), so something needs to provide the implementation of that symbol. That "something" is typically the C++ standard library.
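You can reproduce the dependency in isolation. A minimal sketch (the file and type names are made up; the struct stands in for XLXmlData):
// repro.cpp - any type with a std::unique_ptr member drags in operator delete
#include <memory>

struct HasOwner {
    std::unique_ptr<int> p = std::make_unique<int>(42);
};

extern "C" int test() {
    HasOwner h;   // ~HasOwner() -> unique_ptr's deleter -> operator delete(void*)
    return *h.p;
}
Compiling that to an object file without optimization and listing its undefined symbols (for example with llvm-nm repro.o) should show _ZdlPv (or the sized variant _ZdlPvm, depending on compiler settings) - exactly the import the wasm runtime later fails to resolve.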
However, your build.rs passes None to .cpp_link_stdlib(). According to the docs this disables automatic linking, so the linker will not look for the implementation of delete in the standard library. As nothing else implements delete, this ultimately leads to the unknown import error you are seeing.
To recap, you should link with the standard library. A lot of C++ stuff simply will not work without it.
PROBLEM:
I currently have a traditional module instrumentation pass that inserts new function calls into a given IR according to some logic (the inserted functions are external, from a small lib that is later linked into the given program). Running experiments, my overhead comes from the cost of executing a call to the library function.
What I am trying to do:
I would like to inline these function bodies into the IR of
the given program to get rid of this bottleneck. I assume an intrinsic
would be a clean way of doing this, since an intrinsic function would
be expanded to its function body when being lowered to ASM (please
correct me if my understanding is incorrect here, this is my first
time working with intrinsics/LTO).
Current Status:
My original library call definition:
void register_my_mem(void *user_vaddr){
... C code ...
}
So far:
I have created a def in: llvm-project/llvm/include/llvm/IR/IntrinsicsX86.td
let TargetPrefix = "x86" in {
def int_x86_register_mem : GCCBuiltin<"__builtin_register_my_mem">,
Intrinsic<[], [llvm_anyint_ty], []>;
}
Added another def in:
otwm/llvm-project/clang/include/clang/Basic/BuiltinsX86.def
TARGET_BUILTIN(__builtin_register_my_mem, "vv*", "", "")
Added my library source (*.c, *.h) to compiler-rt/lib/test_lib and added it to the CMakeLists.txt
Replaced the plain function insertion with an attempt to insert the intrinsic instead, in llvm/lib/Transforms/Instrumentation/myModulePass.cpp
WAS:
FunctionCallee sm_func =
    curr_inst->getModule()->getOrInsertFunction("register_my_mem", func_type);
ArrayRef<Value*> args = {
    builder.CreatePointerCast(sm_arg_val, currType->getPointerTo())
};
builder.CreateCall(sm_func, args);
NEW:
Intrinsic::ID aREGISTER(Intrinsic::x86_register_my_mem);
Function *sm_func =
    Intrinsic::getDeclaration(currFunc->getParent(), aREGISTER, func_type);
ArrayRef<Value*> args = {
    builder.CreatePointerCast(sm_arg_val, currType->getPointerTo())
};
builder.CreateCall(sm_func, args);
Questions:
If my logic for inserting the intrinsic functions shouldn't be in a module pass, where do I put it?
Am I confusing LTO with intrinsics?
Do I put my library function definitions into the following files, as mentioned in http://lists.llvm.org/pipermail/llvm-dev/2017-June/114322.html, for example as EmitRegisterMyMem()?
clang/lib/CodeGen/CodeGenFunction.cpp - define llvm::Intrinsic::ID
clang/lib/CodeGen/CodeGenFunction.h - declare llvm::Intrinsic::ID
My LLVM compiles, so it is semantically correct, but currently when
trying to insert this function call, LLVM segfaults saying "Not a valid type for function argument!"
I'm seeing multiple issues here.
Indeed, you're confusing LTO with intrinsics. Intrinsics are special "functions" that are either expanded into special instructions by a backend or lowered to library function calls. That is certainly not what you want to achieve here. You don't need an intrinsic at all; you just need to inline the function call in question, either by hand (from your module pass) or via LTO, indeed.
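For the by-hand route, here is a rough sketch of what a module pass could do, assuming the library's bitcode has already been linked into the module so the callee has a body. The function name matches the question; everything else is illustrative, and the InlineFunction signature shown is the one in recent LLVM releases (older releases take a CallSite instead):
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/InstrTypes.h"            // CallBase
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/Utils/Cloning.h" // InlineFunction, InlineFunctionInfo

using namespace llvm;

// Inline every call to register_my_mem whose definition is available in M.
static void inlineRegisterCalls(Module &M) {
  Function *Callee = M.getFunction("register_my_mem");
  if (!Callee || Callee->isDeclaration())
    return; // nothing to inline against

  SmallVector<CallBase *, 8> Calls;
  for (User *U : Callee->users())
    if (auto *CB = dyn_cast<CallBase>(U))
      if (CB->getCalledFunction() == Callee)
        Calls.push_back(CB);

  for (CallBase *CB : Calls) {
    InlineFunctionInfo IFI;
    if (!InlineFunction(*CB, IFI).isSuccess())
      errs() << "could not inline a call to register_my_mem\n";
  }
}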
The particular error comes because you're declaring your intrinsic as receiving an integer argument (and that is what the declaration will look like), but you are:
asking for the declaration of an overloaded intrinsic with an invalid type (I'd assume your func_type is a non-integer type), and
passing a pointer argument.
Hope this makes the issue clear.
See also: https://llvm.org/docs/LinkTimeOptimization.html
Thank you for clearing up the issue, @Anton Korobeynikov.
After reading your explanation, I also believe that I have to use LTO to accomplish what I am trying to do. I found this link especially useful: https://llvm.org/docs/LinkTimeOptimization.html. It seems that I am now on the right path.
I have a working set consisting of a TCL script plus a C++ extension, but I don't know exactly how it works or how it was compiled. I am using gcc on Arch Linux.
It works as follows: when we execute the test.tcl script, it passes some values to an object of a class defined in the C++ extension. Using these values, the extension (via a macro) produces some results and prints some graphics.
In the test.tcl script I have:
#!object
use_namespace myClass
proc simulate {} {
    uplevel #0 {
        set running 1
        for {} {$running} { } {
            moveBugs
            draw .world.canvas
            .statusbar configure -text "t:[tstep]"
        }
    }
}
set toroidal 1
set nx 100
set ny 100
set mv_dist 4
setup $nx $ny $mv_dist $toroidal
addBugs 100
# size of a grid cell in pixels
set scale 5
myClass.scale 5
The object.cc looks like:
#include //some includes here
MyClass myClass;
make_model(myClass); // --> this is a macro!
The Macro "make_model(myClass)" expands as follows:
namespace myClass_ns { DEFINE_MYLIB_LIBRARY; int TCL_obj_myClass
(mylib::TCL_obj_init(myClass),TCL_obj(mylib::null_TCL_obj,
(std::string)"myClass",myClass),1); };
The Class definition is:
class MyClass:
{
public:
int tstep; //timestep - updated each time moveBugs is called
int scale; //no. pixels used to represent bugs
void setup(TCL_args args) {
int nx=args, ny=args, moveDistance=args;
bool toroidal=args;
Space::setup(nx,ny,moveDistance,toroidal);
}
The whole thing creates a cell-grid with some dots (bugs) moving from one cell to another.
My questions are:
How do the class methods and variables get the script values?
How is it possible to have C++ code and compile it without a main function?
What is that macro doing there in the extension, and how does it work?
Thanks
Whenever a command in Tcl is run, it calls a function that implements that command. That function is written in a language like C or C++, and it is passed the arguments (either as strings or as Tcl_Obj* values). A full extension will also include a function to do the library initialisation; that function (which is external, has C linkage, and has a name like Foo_Init if your library is foo.dll) does basic setup tasks like registering the implementation functions as commands, and it's explicit because it takes a reference to the interpreter context that is being initialised.
The implementation functions can do pretty much anything they want, but to return a result they use one of the functions Tcl_SetResult, Tcl_SetObjResult, etc. and they have to return an int containing the relevant exception code. The usual useful ones are TCL_OK (for no exception) and TCL_ERROR (for stuff's gone wrong). This is a C API, so C++ exceptions aren't allowed.
It's possible to use C++ instance methods as command implementations, provided there's a binding function in between. In particular, the function has to get the instance pointer by casting a ClientData value (an alias for void* in reality, remember this is mostly a C API) and then invoking the method on that. It's a small amount of code.
Compiling things is just building a DLL that links against the right library (or libraries, as required). While extensions are usually recommended to link against the stub library, it's not necessary when you're just developing and testing on one machine. But if you're linking against the Tcl DLL, you'd better make sure that the code gets loaded into a tclsh that uses that DLL. Stub libraries get rid of that tight binding, providing pretty strong ABI stability, but are a little more work to set up; you need to define the right C macro to turn them on and you need to do an extra API call in your initialisation function.
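To make the moving parts concrete, here is a minimal sketch of such an extension written against the Tcl C API. The class, command, and package names are invented for illustration; the USE_TCL_STUBS define and the Tcl_InitStubs call are the extra stub-library steps mentioned above, and the init function's name follows from the library file name (mini.so gives Mini_Init):
// mini.cpp - build as a shared library, then in Tcl: load ./mini.so ; counter_next
#define USE_TCL_STUBS          // link against the Tcl stub library
#include <tcl.h>

// A C++ object whose method we expose as a Tcl command.
class Counter {
public:
    int next() { return ++value_; }
private:
    int value_ = 0;
};

// Binding function: Tcl hands back the ClientData we registered, which we
// cast to the C++ instance, invoke the method on, and report via the result.
static int CounterNextCmd(ClientData cd, Tcl_Interp *interp,
                          int objc, Tcl_Obj *const objv[]) {
    if (objc != 1) {
        Tcl_WrongNumArgs(interp, 1, objv, nullptr);
        return TCL_ERROR;
    }
    Counter *counter = static_cast<Counter *>(cd);
    Tcl_SetObjResult(interp, Tcl_NewIntObj(counter->next()));
    return TCL_OK;
}

// Init function: external, C linkage, registers the command implementations.
extern "C" int Mini_Init(Tcl_Interp *interp) {
    if (Tcl_InitStubs(interp, "8.6", 0) == nullptr)
        return TCL_ERROR;
    static Counter counter;   // one instance shared by every invocation
    Tcl_CreateObjCommand(interp, "counter_next", CounterNextCmd, &counter, nullptr);
    return Tcl_PkgProvide(interp, "Mini", "1.0");
}
With stubs enabled, the extension is linked against the Tcl stub library (e.g. -ltclstub8.6) instead of the main Tcl library.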
I assume you already know how to compile and link C++ code. I won't tell you how to do it, but there's bound to be other questions here on Stack Overflow if you need assistance.
Using the code? For an extension, it's basically just:
# Dynamically load the DLL and call the init function
load /path/to/your.dll
# Commands are all present, so use them
NewCommand 3
There are some extra steps later on to turn a DLL into a proper Tcl package, abstracting code that uses the DLL away from the fact that it is exactly that DLL and so on, but they're not something to worry about until you've got things working a lot more.
I have a working Python wrapper for C++ code (as suggested here: Calling C/C++ from python?) using ctypes. But the problem is with the main function of the code. When I do something like
extern "C" {
    void call_main() { main(); }
}
in my C++ code and then call this function via the Python wrapper
...
lib = cdll.LoadLibrary('./mylib.so')

def run():
    lib.call_main()
-> I get "segmentation fault".
The funny part is that when I copy-paste my main method's code into a function called e.g. test (so it is int test() {....#pasted code...} in the C++ code), extern it, and then call lib.test()
=> And everything works fine... So it must be a problem with the function being called main or something.
In C++, calling main() recursively is not allowed (see 3.6.1, basic.start.main, paragraph 3). Also, you need a C++-aware entry point when you want to call C++ functionality. You can sometimes get away with calling C++ functionality without this, but what is going to work and what is not isn't entirely straightforward. The obvious problem is with global objects needing initialization.
Just put the code you want to call into a different function and call that instead.
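A minimal sketch of that refactoring (the names other than main are made up):
// mylib.cpp - move the real work out of main() so it can be exported
#include <iostream>

static int run_everything() {
    std::cout << "doing the actual work\n";
    return 0;
}

int main() {
    return run_everything();   // normal executable entry point
}

extern "C" int call_main() {   // what the ctypes wrapper loads and calls
    return run_everything();
}
On the Python side the call stays lib.call_main(), exactly as before.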
Say, for example, I have a compiled Objective-C Module that contains something like the following:
typedef bool (^BoolBlock)(void);

BoolBlock returnABlock(void)
{
    return Block_copy(^bool(void){
        printf("Block executing.\n");
        return YES;
    });
}
...then, using the LLVM C++ API, I load that Module and create a CallInst to call the returnABlock() function:
Function *returnABlockFunction = returnABlockModule->getFunction(std::string("returnABlock"));
CallInst *returnABlockCall = CallInst::Create(returnABlockFunction, "returnABlockCall", entryBlock);
How can I then invoke the Block returned via the returnABlockCall object?
Not an easy answer here, I'm afraid. Blocks are lowered by the front-end into calls into the blocks runtime. In the case of clang, the relevant code is at clang/lib/CodeGen/CGBlocks.[h|cpp].
It would be worth asking on the cfe-dev list if there's a way to factor this code out for reuse in other front-ends.
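For reference, the shape the front-end lowers a block literal to is documented in Clang's Block Implementation Specification. A sketch of how the returned value could be invoked from ordinary C++ code, assuming that ABI (the struct here is hand-written and abridged):
// Layout of a block literal per Clang's Block-ABI documentation (abridged).
struct BlockLiteral {
    void *isa;
    int flags;
    int reserved;
    bool (*invoke)(void *);   // the first parameter is the block itself
    void *descriptor;
};

// Calling the block means loading the invoke member and passing the block
// pointer as argument 0.
static bool callBoolBlock(void *block) {
    auto *literal = static_cast<BlockLiteral *>(block);
    return literal->invoke(block);
}
Doing the same at the IR level from the CallInst in the question would mean GEPing to the invoke field, loading it, and emitting a call that takes returnABlockCall as its first argument.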
In C, I just act as if the var I assigned the block to was a function pointer. Using your code as an example, after you assign the result of the function to "returnABlockCall", you could just write:
returnABlockCall();
and it should work.
Warning, this is untested in C++, but I see no reason why it wouldn't work.