Adding an Object File to JIT and calling it from IR code - c++

I created a modified version of HowToUseJIT.cpp (LLVM version 11.x) that uses the IRBuilder class to build a function that calls an external function defined in a shared object file.
This example works fine (on my system) when the external function has an int argument and return value, but it fails when the argument and return value are double.
The source for the int case is included below. In addition, the source has instructions, at the top, for transforming it to the double case.
What is wrong with the double version of this example?
/*
This file is a modified version of the llvm 11.x example HowToUseJIT.cpp:
The file callee.c contains the following text:
int callee(int arg)
{ return arg + 1; }
The shared library callee.so is created from callee.c as follows:
clang -shared callee.c -o callee.so
This example calls the function callee from a function that is generated using
the IRBuilder class. It links callee by loading callee.so into its LLJIT.
This works on my system where the program output is
add1(42) = 43
which is correct.
If I change the type of the function callee from "int (*)(int)" to
"double (*)(double)", the program output is
add1(42) = 4.200000e+01
which is incorrect.
I use the following command to change callee.c so that it uses double:
sed -i callee.c \
-e 's|int callee(int arg)|double callee(double arg)|' \
-e 's|return arg + 1;|return arg + 1.0;|'
I use the following command to change this file so that it should properly
link to the double version of callee:
sed -i add_obj2jit.cpp \
-e '30,$s|"int"|"double"|' \
-e '30,$s|getInt32Ty|getDoubleTy|g' \
-e '/getAddress/s|int|double|g' \
-e 's|int Result = Add1(42);|double Result = Add1(42.0);|
What is wrong with the double version of this example ?
*/
#include "llvm/ExecutionEngine/Orc/LLJIT.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/InitLLVM.h"
#include "llvm/Support/TargetSelect.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;
using namespace llvm::orc;
ExitOnError ExitOnErr;
// --------------------------------------------------------------------------
void add_obj2jit(LLJIT* jit, const std::string filename)
{ // load object file into memory_buffer
ErrorOr< std::unique_ptr<MemoryBuffer> > error_or_buffer =
MemoryBuffer::getFile(filename);
std::error_code std_error_code = error_or_buffer.getError();
if( std_error_code )
{ std::string msg = "add_obj2jit: " + filename + "\n";
msg += std_error_code.message();
std::fprintf(stderr, "%s\n", msg.c_str() );
std::exit( std_error_code.value() );
}
std::unique_ptr<MemoryBuffer> memory_buffer(
std::move( error_or_buffer.get() )
);
// move object file into jit
Error error = jit->addObjectFile( std::move(memory_buffer) );
if( error )
{ std::fprintf(stderr, "Can't load object file %s", filename.c_str());
std::exit(1);
}
}
// --------------------------------------------------------------------------
ThreadSafeModule createDemoModule() {
auto Context = std::make_unique<LLVMContext>();
auto M = std::make_unique<Module>("test", *Context);
// function_t
// function has a return type of "int" and takes an argument of "int".
FunctionType* function_t = FunctionType::get(
Type::getInt32Ty(*Context), {Type::getInt32Ty(*Context)}, false
);
// declare the callee function
AttributeList empty_attributes;
FunctionCallee callee = M->getOrInsertFunction(
"callee", function_t, empty_attributes
);
// Create the add1 function entry and insert this entry into module M.
Function *Add1F = Function::Create(
function_t, Function::ExternalLinkage, "add1", M.get()
);
// Add a basic block to the function. As before, it automatically inserts
// because of the last argument.
BasicBlock *BB = BasicBlock::Create(*Context, "EntryBlock", Add1F);
// Create a basic block builder with default parameters. The builder will
// automatically append instructions to the basic block `BB'.
IRBuilder<> builder(BB);
// Get pointers to the integer argument of the add1 function...
assert(Add1F->arg_begin() +1 == Add1F->arg_end()); // Make sure there's an arg
Argument *ArgX = &*Add1F->arg_begin(); // Get the arg
ArgX->setName("AnArg"); // Give it a nice symbolic name for fun.
// Create the call instruction, inserting it into the end of BB.
Value *Add = builder.CreateCall( callee, {ArgX}, "Add=callee(ArgX)" );
// Create the return instruction and add it to the basic block
builder.CreateRet(Add);
return ThreadSafeModule(std::move(M), std::move(Context));
}
// --------------------------------------------------------------------------
int main(int argc, char *argv[]) {
// Initialize LLVM.
InitLLVM X(argc, argv);
InitializeNativeTarget();
InitializeNativeTargetAsmPrinter();
cl::ParseCommandLineOptions(argc, argv, "add_obj2jit");
ExitOnErr.setBanner(std::string(argv[0]) + ": ");
// Create an LLJIT instance.
auto J = ExitOnErr(LLJITBuilder().create());
auto M = createDemoModule();
ExitOnErr(J->addIRModule(std::move(M)));
add_obj2jit(J.get(), "callee.so");
// Look up the JIT'd function, cast it to a function pointer, then call it.
auto Add1Sym = ExitOnErr(J->lookup("add1"));
int (*Add1)(int) = (int (*)(int))Add1Sym.getAddress();
int Result = Add1(42);
outs() << "add1(42) = " << Result << "\n";
// return error number
if( Result != 43 )
return 1;
return 0;
}

Andrea:
Thanks for asking to see the IR output. Changing the example code line
// llvm::outs() << *M;
to the line
llvm::outs() << *M;
generates this output.
Looking at the output, it was clear to me that the second sed command had failed.
This was because it was missing a single quote at the end.
When I fixed this, the double case worked. Here is the output, including the IR, for the int case:
; ModuleID = 'test'
source_filename = "test"
declare i32 @callee(i32)
define i32 @add1(i32 %AnArg) {
EntryBlock:
%0 = call i32 @callee(i32 %AnArg)
ret i32 %0
}
add1(42) = 43
Here is the output for the double case:
; ModuleID = 'test'
source_filename = "test"
declare double @callee(double)
define double @add1(double %AnArg) {
EntryBlock:
%0 = call double @callee(double %AnArg)
ret double %0
}
add1(42) = 4.300000e+01

Related

C++, efficient way to call many possible functions from user

I'm relatively new to C++; I have mostly worked with Python.
I have a scenario where a user (me) uses a GUI to send commands to a microcontroller over serial, and the microcontroller then processes them.
Right now I have 10 commands, but as the project develops (some form of modular robot) I can envision having 50-100 possible commands.
Is there a better way for my C++ handleCommands function to select which of the possible 100 functions to run without massive switch statements or if/else chains?
Extract of the code:
char cmd = 1; // example place holder
int value = 10; //example place holder
switch (cmd){
case '1':
toggleBlink(value);
break;
case '2':
getID(value); // in this case value gets ignored by the function as its not required
break;
This works fine for 3-4 functions but doesn't seem to me like the best way to do it for more functions.
I've heard of lookup tables, but as each function is different and may or may not require arguments, I'm confused about how to implement them.
Some background on the set-up:
The commands are mainly diagnostic (<ID> etc.), plus a couple of functional ones that require parameters, like <blink,10> <runto,90> <set-mode,locked>.
The validation is done in Python against a CSV file, and the actual serial message sent to the microcontroller is <(index of command in CSV file),parameter>, with <, > and , being delimiters.
So the user would type blink,10 and the Python app will send <1,10> over serial, as blink is found at index 1 of the CSV file.
The microcontroller reads these in and I am left with two char arrays: the command array containing a number, and the value array containing the value sent (also a number).
As I'm running this on a microcontroller, I don't really want to store a long file of possible commands in flash, hence the validation done on the Python GUI side.
Note that in the case of a possible multi-argument function, say <move,90,30>, i.e. move 90 degrees in 30 seconds, the actual function would only receive one argument "90,30" and then split that up as needed.
If you have the commands coming over the serial line in the format
<command-mapped-to-a-number,...comma-separated-parameters...>
we can simulate that like so:
#include <cstdint> // needed for uint8_t
#include <iostream>
#include <sstream> // needed for simple parsing
#include <string>
#include <unordered_map> // needed for mapping of commands to functors
int main() {
std::cout << std::boolalpha;
// example commands lines read from serial:
for (auto& cmdline : {"<1,10>", "<2,10,90>", "<3,locked>", "<4>"}) {
std::cout << exec(cmdline) << '\n';
}
}
exec above is the interpreter that will return true if the command line was parsed and executed ok. In the examples above, command 1 takes one parameter, 2 takes two, 3 takes one (string) and 4 doesn't have a parameter.
The mapping from command-mapped-to-a-number could be an enum:
// uint8_t has room for 256 commands, make it uint16_t to get room for 65536 commands
enum class command_t : uint8_t {
blink = 1,
take_two = 2,
set_mode = 3,
no_param = 4,
};
and exec would make the most basic validation of the command line (checking < and >) and put it in a std::istringstream for easy extraction of the information on this command line:
bool exec(const std::string& cmdline) {
if(cmdline.size() < 2 || cmdline.front() != '<' || cmdline.back() != '>' )
return false;
// put all but `<` and `>` in an istringstream:
std::istringstream is(cmdline.substr(1,cmdline.size()-2));
// extract the command number
if (int cmd; is >> cmd) {
// look-up the command number in an `unordered_map` that is mapped to a functor
// that takes a reference to an `istringstream` as an argument:
if (auto cit = commands.find(command_t(cmd)); cit != commands.end()) {
// call the correct functor with the rest of the command line
// so that it can extract, validate and use the arguments:
return cit->second(is);
}
return false; // command look-up failed
}
return false; // command number extraction failed
}
The only tricky part left is the unordered_map of commands and functors.
Here's a start:
// a helper to eat commas from the command line
struct comma_eater {} comma;
std::istream& operator>>(std::istream& is, const comma_eater&) {
// next character must be a comma or else the istream's failbit is set
if(is.peek() == ',') is.ignore();
else is.setstate(std::ios::failbit);
return is;
}
std::unordered_map<command_t, bool (*)(std::istringstream&)> commands{
{command_t::blink,
[](std::istringstream& is) {
if (int i; is >> comma >> i && is.eof()) {
std::cout << "<blink," << i << "> ";
return true;
}
return false;
}},
{command_t::take_two,
[](std::istringstream& is) {
if (int a, b; is >> comma >> a >> comma >> b && is.eof()) {
std::cout << "<take-two," << a << ',' << b << "> ";
return true;
}
return false;
}},
{command_t::set_mode,
[](std::istringstream& is) {
if (std::string mode; is >> comma && std::getline(is, mode,',') && is.eof()) {
std::cout << "<set-mode," << mode << "> ";
return true;
}
return false;
}},
{command_t::no_param,
[](std::istringstream& is) {
if (is.eof()) {
std::cout << "<no-param> ";
return true;
}
return false;
}},
};
If you put that together you'll get the below output from the successful parsing (and execution) of all command lines received:
<blink,10> true
<take-two,10,90> true
<set-mode,locked> true
<no-param> true
Given an integer index for each "command" a simple function pointer look-up table can be used. For example:
#include <cstdio>
namespace
{
// Command functions (dummy examples)
int examleCmdFunctionNoArgs() ;
int examleCmdFunction1Arg( int arg1 ) ;
int examleCmdFunction2Args( int arg1, int arg2 ) ;
int examleCmdFunction3Args( int arg1, int arg2, int arg3 ) ;
int examleCmdFunction4Args( int arg1, int arg2, int arg3, int arg4 ) ;
const int MAX_ARGS = 4 ;
const int MAX_CMD_LEN = 32 ;
typedef int (*tCmdFn)( int, int, int, int ) ;
// Symbol table
#define CMD( f ) reinterpret_cast<tCmdFn>(f)
static const tCmdFn cmd_lookup[] =
{
0, // Invalid command
CMD( examleCmdFunctionNoArgs ),
CMD( examleCmdFunction1Arg ),
CMD( examleCmdFunction2Args ),
CMD( examleCmdFunction3Args ),
CMD( examleCmdFunction4Args )
} ;
}
namespace cmd
{
// For commands of the form: "<cmd_index[,arg1[,arg2[,arg3[,arg4]]]]>"
// i.e. an angle-bracketed, comma-delimited sequence comprising a command
// index followed by zero or more arguments.
// e.g.: "<1,123,456,0>"
int execute( const char* command )
{
int ret = 0 ;
int argv[MAX_ARGS] = {0} ;
int cmd_index = 0 ;
int tokens = std::sscanf( command, "<%d,%d,%d,%d,%d>", &cmd_index, &argv[0], &argv[1], &argv[2], &argv[3] ) ; // input string first, then format
if( tokens > 0 && cmd_index < sizeof(cmd_lookup) / sizeof(*cmd_lookup) )
{
if( cmd_index > 0 )
{
ret = cmd_lookup[cmd_index]( argv[0], argv[1], argv[2], argv[3] ) ;
}
}
return ret ;
}
}
The command execution passes four arguments (you can expand that as necessary) but for command functions taking fewer arguments they will simply be "dummy" arguments that will be ignored.
Your proposed translation to an index is somewhat error-prone and maintenance-heavy, since it requires you to keep the PC application's symbol table and the embedded look-up table in sync. It may not be prohibitive to have the symbol table on the embedded target; for example:
#include <cstdio>
#include <cstring>
namespace
{
// Command functions (dummy examples)
int examleCmdFunctionNoArgs() ;
int examleCmdFunction1Arg( int arg1 ) ;
int examleCmdFunction2Args( int arg1, int arg2 ) ;
int examleCmdFunction3Args( int arg1, int arg2, int arg3 ) ;
int examleCmdFunction4Args( int arg1, int arg2, int arg3, int arg4 ) ;
const int MAX_ARGS = 4 ;
const int MAX_CMD_LEN = 32 ;
typedef int (*tCmdFn)( int, int, int, int ) ;
// Symbol table
#define SYM( c, f ) {#c, reinterpret_cast<tCmdFn>(f)}
static const struct
{
const char* symbol ;
const tCmdFn command ;
} symbol_table[] =
{
SYM( cmd0, examleCmdFunctionNoArgs ),
SYM( cmd1, examleCmdFunction1Arg ),
SYM( cmd2, examleCmdFunction2Args ),
SYM( cmd3, examleCmdFunction3Args ),
SYM( cmd4, examleCmdFunction4Args )
} ;
}
namespace cmd
{
// For commands of the form: "cmd[ arg1[, arg2[, arg3[, arg4]]]]"
// i.e a command string followed by zero or more comma-delimited arguments
// e.g.: "cmd3 123, 456, 0"
int execute( const char* command_line )
{
int ret = 0 ;
int argv[MAX_ARGS] = {0} ;
char cmd[MAX_CMD_LEN + 1] ;
int tokens = std::sscanf( command_line, "%32s %d,%d,%d,%d", cmd, &argv[0], &argv[1], &argv[2], &argv[3] ) ; // input string first, then format; %32s caps the read at MAX_CMD_LEN
if( tokens > 0 )
{
bool cmd_found = false ;
for( int i = 0;
!cmd_found && i < sizeof(symbol_table) / sizeof(*symbol_table);
i++ )
{
cmd_found = std::strcmp( cmd, symbol_table[i].symbol ) == 0 ;
if( cmd_found )
{
ret = symbol_table[i].command( argv[0], argv[1], argv[2], argv[3] ) ;
}
}
}
return ret ;
}
}
For very large symbol tables you might want a more sophisticated look-up, but depending on the required performance and determinism, the simple exhaustive search will be sufficient - far faster than the time taken to send the serial data.
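If the table does grow large, one possible "more sophisticated look-up" is to keep the entries sorted by symbol and use a binary search. The following is only a sketch under that assumption; the handler names, the Entry struct and find_command are hypothetical and not part of the code above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

// Hypothetical command handlers; each takes the same four int arguments.
int cmd_blink(int a, int, int, int)   { return a; }
int cmd_id(int, int, int, int)        { return 42; }
int cmd_runto(int a, int b, int, int) { return a + b; }

using CmdFn = int (*)(int, int, int, int);

struct Entry {
    const char* symbol;
    CmdFn       command;
};

// The table must be kept sorted by symbol (strcmp order) for the search to work.
static const Entry sorted_table[] = {
    {"blink", cmd_blink},
    {"id",    cmd_id},
    {"runto", cmd_runto},
};

// Returns the handler for `symbol`, or nullptr if it is not in the table.
CmdFn find_command(const char* symbol) {
    assert(symbol != nullptr);
    const Entry* first = std::begin(sorted_table);
    const Entry* last  = std::end(sorted_table);
    // Binary search for the first entry whose symbol is not less than the key.
    const Entry* it = std::lower_bound(first, last, symbol,
        [](const Entry& e, const char* key) {
            return std::strcmp(e.symbol, key) < 0;
        });
    if (it != last && std::strcmp(it->symbol, symbol) == 0)
        return it->command;
    return nullptr;
}
```

The search takes O(log n) string comparisons instead of O(n), at the cost of having to keep the table sorted whenever a command is added.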
Whilst the resource requirement for the symbol table is somewhat higher than for the indexed look-up, it is nonetheless ROM-able and can be located in Flash memory, which on most MCUs is a less scarce resource than SRAM. Being static const, the linker/compiler will most likely place the tables in ROM without any specific directive, though you should check the link map or the toolchain documentation on this.
In both cases I have defined the command functions and executor as returning an int. That is optional of course, but you might use that for returning responses to the PC issuing the serial command.
What you are talking about are remote procedure calls. So you need to have some mechanism to serialize and un-serialize the calls.
As mentioned in the comments you can make a map from cmd to the function implementing the command. Or simply an array. But the problem remains that different functions will want different arguments.
So my suggestion would be to add a wrapper function using variadic templates.
Prefix every command with the length of data for the command so the receiver can read a block of data for the command and knows when to dispatch it to a function. The wrapper then takes the block of data, splits it into the right size for each argument, converts it and then calls the real function.
Now you can make a map or array of those wrapper functions, each one bound to one command, and the compiler will generate the un-serialize code for you from the types. (You still have to do it once for each type; the compiler only combines those for the full function call.)
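A minimal sketch of such a wrapper, assuming C++17 and a wire format in which the payload is simply the raw bytes of each argument, back to back, with the same endianness on both ends. The names read_arg, invoke_from_bytes, wrapper, blink and move_to are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <tuple>
#include <type_traits>

// Copies one value of type T out of the payload and advances the offset.
template <typename T>
T read_arg(const std::uint8_t* data, std::size_t& offset) {
    static_assert(std::is_trivially_copyable<T>::value, "raw byte copy only");
    T value;
    std::memcpy(&value, data + offset, sizeof(T));
    offset += sizeof(T);
    return value;
}

// Un-serializes Args... from the payload and calls fn with them.
template <typename... Args>
bool invoke_from_bytes(void (*fn)(Args...), const std::uint8_t* data, std::size_t len) {
    assert(fn != nullptr);
    // Reject payloads whose length does not match the argument sizes.
    if (len != (sizeof(Args) + ... + std::size_t{0})) return false;
    std::size_t offset = 0;
    // Braced initialization guarantees left-to-right evaluation of the reads.
    std::tuple<Args...> args{read_arg<Args>(data, offset)...};
    std::apply(fn, args);
    return true;
}

// Uniform signature stored in the dispatch table.
using Handler = bool (*)(const std::uint8_t* data, std::size_t len);

// One wrapper per command function, generated by the compiler.
template <auto Fn>
bool wrapper(const std::uint8_t* data, std::size_t len) {
    return invoke_from_bytes(Fn, data, len);
}

// Hypothetical command functions; they record their arguments for demonstration.
std::int32_t last_blink = 0;
std::int16_t last_angle = 0, last_seconds = 0;
void blink(std::int32_t times) { last_blink = times; }
void move_to(std::int16_t angle, std::int16_t seconds) {
    last_angle = angle;
    last_seconds = seconds;
}

// The index in this array is the command number sent over the wire.
const Handler dispatch_table[] = { wrapper<&blink>, wrapper<&move_to> };
```

The receiver would read the command number and the length prefix, collect that many bytes, and call dispatch_table[cmd](payload, len). A false return means the payload size did not match the function's argument list.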

Writing a LLVM transformation pass to inject delays at the beginning of each function

I am new to LLVM. I am trying to write an LLVM transformation pass that will inject a delay at the beginning of each called function at run time.
I found the following code that injects a printf statement at the beginning of each function.
How can I change the code to inject a delay instead of the printf? (I am using LLVM 10.)
Below is the code:
bool InjectFuncCall::runOnModule(Module &M) {
bool InsertedAtLeastOnePrintf = false;
auto &CTX = M.getContext();
PointerType *PrintfArgTy = PointerType::getUnqual(Type::getInt8Ty(CTX));
// STEP 1: Inject the declaration of printf
// ----------------------------------------
// Create (or _get_ in cases where it's already available) the following
// declaration in the IR module:
// declare i32 @printf(i8*, ...)
// It corresponds to the following C declaration:
// int printf(char *, ...)
FunctionType *PrintfTy = FunctionType::get(
IntegerType::getInt32Ty(CTX),
PrintfArgTy,
/*IsVarArgs=*/true);
FunctionCallee Printf = M.getOrInsertFunction("printf", PrintfTy);
// Set attributes as per inferLibFuncAttributes in BuildLibCalls.cpp
Function *PrintfF = dyn_cast<Function>(Printf.getCallee());
PrintfF->setDoesNotThrow();
PrintfF->addParamAttr(0, Attribute::NoCapture);
PrintfF->addParamAttr(0, Attribute::ReadOnly);
// STEP 2: Inject a global variable that will hold the printf format string
// ------------------------------------------------------------------------
llvm::Constant *PrintfFormatStr = llvm::ConstantDataArray::getString(
CTX, "(llvm-tutor) Hello from: %s\n(llvm-tutor) number of arguments: %d\n");
Constant *PrintfFormatStrVar =
M.getOrInsertGlobal("PrintfFormatStr", PrintfFormatStr->getType());
dyn_cast<GlobalVariable>(PrintfFormatStrVar)->setInitializer(PrintfFormatStr);
// STEP 3: For each function in the module, inject a call to printf
// ----------------------------------------------------------------
for (auto &F : M) {
if (F.isDeclaration())
continue;
// Get an IR builder. Sets the insertion point to the top of the function
IRBuilder<> Builder(&*F.getEntryBlock().getFirstInsertionPt());
// Inject a global variable that contains the function name
auto FuncName = Builder.CreateGlobalStringPtr(F.getName());
// Printf requires i8*, but PrintfFormatStrVar is an array: [n x i8]. Add
// a cast: [n x i8] -> i8*
llvm::Value *FormatStrPtr =
Builder.CreatePointerCast(PrintfFormatStrVar, PrintfArgTy, "formatStr");
// The following is visible only if you pass -debug on the command line
// *and* you have an assert build.
LLVM_DEBUG(dbgs() << " Injecting call to printf inside " << F.getName()
<< "\n");
// Finally, inject a call to printf
Builder.CreateCall(
Printf, {FormatStrPtr, FuncName, Builder.getInt32(F.arg_size())});
InsertedAtLeastOnePrintf = true;
}
return InsertedAtLeastOnePrintf;
}
Also it would be great if there are links for good LLVM tutorials for beginners.
You'll have to declare the delay function the same way you declared printf, except you'll want to change the argument type from i8* to i32. For the tutorials, you could check these out:
https://anoopsarkar.github.io/compilers-class/llvm-practice.html
https://www.usna.edu/Users/cs/wcbrown/courses/F19SI413/lab/l13/lab.html
https://osterlund.xyz/posts/2017-11-28-LLVM-pass.html
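Coming back to the delay function itself: the injected call needs something to resolve to at link time. Below is a sketch of one possible runtime helper, assuming a hypothetical name delay_ms and a millisecond argument; neither comes from the question. On the pass side, the declaration would presumably be built like the printf one, using FunctionType::get with a void return type, a single i32 parameter and IsVarArgs set to false, and the call created with Builder.CreateCall(Delay, {Builder.getInt32(N)}).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical runtime helper that the injected call would target.
// extern "C" keeps the symbol unmangled, so an IR declaration such as
//   declare void @delay_ms(i32)
// can resolve to it when the instrumented program is linked.
extern "C" void delay_ms(std::int32_t ms) {
    assert(ms >= 0);
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
}
```

This file would be compiled separately and linked with the instrumented program, so the pass only has to emit the declaration and the call.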

How to handle type errors when working with blobs in SQLite?

Is there a good way to handle type errors when working with blobs in SQLite? For example, the following code registers two functions create_vector and display_vector. Basically, create_vector stores a std::vector as a blob and display_vector converts this blob into text, so that we can see it:
/* In order to use
sqlite> .load "./blob.so"
sqlite> select display_vector(create_vector());
[ 1.200000, 3.400000, 5.600000, 7.800000, 9.100000 ]
*/
#include <string>
#include <sqlite3ext.h>
SQLITE_EXTENSION_INIT1
extern "C" {
int sqlite3_blob_init(
sqlite3 * db,
char ** err,
sqlite3_api_routines const * const api
);
}
// Cleanup handler that deletes an array
template <typename T>
void array_cleanup(void * v) {
delete [] static_cast <T *> (v);
}
// Creates and returns a std::vector as a blob
static void create_vector(
sqlite3_context *context,
int argc,
sqlite3_value **argv
){
// Create a dummy vector
auto * v = new double[5] {1.2,3.4,5.6,7.8,9.10};
// Either cleanup works
sqlite3_result_blob(context,v,sizeof(double[5]),array_cleanup <double>);
}
// Converts a std::vector into text
static void display_vector(
sqlite3_context *context,
int argc,
sqlite3_value **argv
){
// Grab the vector. Note, if this is not a vector, then sqlite will
// almost certainly segfault.
auto const * const v =static_cast <double const * const> (
sqlite3_value_blob(argv[0]));
// Assuming we have a vector, convert it into a string
auto s = std::string("[ ");
for(unsigned i=0;i<5;i++) {
// If we're not on the first element, add a comma
if(i>0) s += ", ";
// Add the number
s += std::to_string(v[i]);
}
s += " ]";
// Return the text
sqlite3_result_text(
context,sqlite3_mprintf("%s",s.c_str()),s.size(),sqlite3_free);
}
// Register our blob functions
int sqlite3_blob_init(
sqlite3 *db,
char **err,
sqlite3_api_routines const * const api
){
SQLITE_EXTENSION_INIT2(api)
// Register the create_vector function
if( int ret = sqlite3_create_function(
db, "create_vector", 0, SQLITE_ANY, 0, create_vector, 0, 0)
) {
*err=sqlite3_mprintf("Error registering create_vector: %s",
sqlite3_errmsg(db));
return ret;
}
// Register the display_vector function
if( int ret = sqlite3_create_function(
db, "display_vector", 1, SQLITE_ANY, 0, display_vector, 0, 0)
) {
*err=sqlite3_mprintf("Error registering display_vector: %s",
sqlite3_errmsg(db));
return ret;
}
// If we've made it this far, we should be ok
return SQLITE_OK;
}
We can compile this with:
$ make
g++ -g -std=c++14 blob.cpp -shared -o blob.so -fPIC
Now, if we use these functions as advertised, everything works fine:
sqlite> .load "./blob.so"
sqlite> select display_vector(create_vector());
[ 1.200000, 3.400000, 5.600000, 7.800000, 9.100000 ]
However, if we try to use display_vector on a non-vector, we segfault:
sqlite> .load "./blob.so"
sqlite> select display_vector(NULL);
Segmentation fault
Really, the issue is that the static_cast in display_vector is not correct. In any case, is there a good way to check the type of the blob, or even to guarantee that we have a blob? Is there a good way to prevent a segfault when a new extension requires an input of a certain type?
A blob is just a bunch of bytes, and not every value is a blob.
Your function should check the value's type with sqlite3_value_type(), and check the length with sqlite3_value_bytes().
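A sketch of that check, written as a standalone helper so it compiles without the SQLite headers. The helper name is_double_blob and the constant kSqliteBlob are assumptions for illustration; inside the extension you would use the SQLITE_BLOB macro from sqlite3.h directly.

```cpp
#include <cassert>
#include <cstddef>

// Mirrors the value of SQLITE_BLOB in sqlite3.h so this sketch is self-contained;
// real extension code should use the SQLITE_BLOB macro itself.
constexpr int kSqliteBlob = 4;

// Returns true only if the value is a blob holding exactly `count` doubles.
//   value_type  : the result of sqlite3_value_type(argv[0])
//   value_bytes : the result of sqlite3_value_bytes(argv[0])
bool is_double_blob(int value_type, int value_bytes, std::size_t count) {
    assert(count > 0);
    if (value_type != kSqliteBlob) return false;  // NULL, text, integer, float
    if (value_bytes < 0) return false;
    return static_cast<std::size_t>(value_bytes) == count * sizeof(double);
}
```

In display_vector, this would be called before the static_cast. On failure, the function can report the problem with sqlite3_result_error(context, "display_vector: expected a blob of 5 doubles", -1) and return instead of dereferencing the pointer. Note that this only validates the storage class and the size; SQLite itself carries no information about what the bytes of a blob mean.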

How to know written var type with Clang using C API instead of actual?

I'm trying to use Clang via its C API, specifically the indexing part. The problem is that some types are returned not as they are written, but as the compiler sees them. For example, "Stream &" becomes "int &" and "byte" becomes "int".
Some test lib:
// TODO make it a subclass of a generic Serial/Stream base class
class FirmataClass
{
public:
FirmataClass(Stream &s);
void setFirmwareNameAndVersion(const char *name, byte major, byte minor);
I'm using the code to get method information:
void showMethodInfo(const CXIdxDeclInfo *info) {
int numArgs = clang_Cursor_getNumArguments(info->cursor);
fprintf(stderr, " %i args:\n", numArgs);
for (int i=0; i<numArgs; i++) {
CXCursor argCursor = clang_Cursor_getArgument(info->cursor, i);
CXString name = clang_getCursorDisplayName(argCursor);
CXString spelling = clang_getCursorSpelling(argCursor);
CXType type = clang_getCursorType(argCursor);
CXString typeSpelling = clang_getTypeSpelling(type);
CXCursorKind kind = clang_getCursorKind(argCursor);
fprintf(stderr, " kind=[%s (%i)], type=[%s], spelling=[%s]\n",
cursor_kinds[kind], kind, clang_getCString(typeSpelling),
clang_getCString(spelling));
clang_disposeString(name);
clang_disposeString(spelling);
clang_disposeString(typeSpelling);
}
// return type
CXType returnType = clang_getCursorResultType(info->cursor);
CXString returnTypeSpelling = clang_getTypeSpelling(returnType);
fprintf(stderr, " returns %s\n", clang_getCString(returnTypeSpelling));
clang_disposeString(returnTypeSpelling);
}
Output:
[105:10 4689] access=[CX_CXXPublic]
kind=[CXIdxEntity_CXXInstanceMethod] (21)
name=[setFirmwareNameAndVersion] is_container=[0] 3 args:
kind=[CXCursor_ParmDecl (10)], type=[const char *], spelling=[name]
kind=[CXCursor_ParmDecl (10)], type=[int], spelling=[major]
kind=[CXCursor_ParmDecl (10)], type=[int], spelling=[minor]
returns void
So you can see that byte function arguments are described as int.
How can I get the actual spelling?
Is byte declared via a typedef, or a #define?
When I declare these types:
typedef int MyType_t;
#define MyType2_t int
class Foo
{
public:
bool bar( MyType_t a, MyType2_t b );
};
And then print the type names I get from clang_getTypeSpelling, this is what I get:
bool Foo_bar( MyType_t a, int b )
Libclang presumably can't print the #defined name because the preprocessor has already replaced it with int by the time the parse tree is built.
I solved this a few days ago.
"Stream &" becomes "int &" and "byte" becomes "int".
libclang doesn't know what Stream or byte are until you add the standard headers manually using the flag -isystem <pathToStdHeaderDirectory>
I wrote a C# function that retrieves all the visual studio VC headers include directory:
private static string[] GetStdIncludes()
{
using (RegistryKey key = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\Wow6432Node\Microsoft\VisualStudio"))
{
if (key != null)
{
var lastVcVersions = key.GetSubKeyNames()
.Select(s =>
{
float result = 0;
if (float.TryParse(s, System.Globalization.NumberStyles.Float, System.Globalization.CultureInfo.InvariantCulture, out result))
return result;
else return 0F;
}).Where(w => w > 0F)
.OrderByDescending(or => or)
.Select(s => s.ToString("n1", System.Globalization.CultureInfo.InvariantCulture))
.ToArray();
foreach (var v in lastVcVersions)
{
using (var vk = key.OpenSubKey(v))
{
var val = (string)vk.GetValue("Source Directories");
if (!string.IsNullOrEmpty(val))
return val.Split(';');
}
}
}
}
throw new Exception("Couldn't find VC runtime include directories");
}
hope that helps
I was having the same issue with my own classes.
You need to pass on the same flags you would use for compiling with clang to either clang_parseTranslationUnit or clang_createTranslationUnit, in particular the -I flags which are used to look up the header files where your class or type definitions are.
It seems that if libclang can't find a type declaration, it just defaults all of them to int.
Calling clang_createIndex ( 1, 1 ) should provide you with hints on what you are missing via stderr.
Here is some sample code that works for me now:
int main ( int argc, char* argv[] )
{
const char *clang_args[] = // string literals are const in C++
{
"-I.",
"-I./include",
"-I../include",
"-x",
"c++",
"-Xclang",
"-ast-dump",
"-fsyntax-only",
"-std=c++1y"
};
CXIndex Idx = clang_createIndex ( 1, 1 );
CXTranslationUnit TU = clang_parseTranslationUnit ( Idx, argv[1], clang_args, 9, NULL, 0, CXTranslationUnit_Incomplete | CXTranslationUnit_SkipFunctionBodies );
clang_visitChildren ( clang_getTranslationUnitCursor ( TU ),
TranslationUnitVisitor, NULL );
clang_disposeTranslationUnit ( TU );
return 0;
}
I am trying to get the AST for a header file, hence the CXTranslationUnit_Incomplete | CXTranslationUnit_SkipFunctionBodies flags and the -ast-dump -fsyntax-only command line options. You may want to omit them if you don't need them, and of course add and change the -I parameters according to your needs.

LLVM JIT segfaults. What am I doing wrong?

It is probably something basic because I am just starting to learn LLVM.
The following creates a factorial function and tries to JIT and execute it (I know the generated function is correct because I was able to statically compile and execute it).
But I get a segmentation fault upon execution of the function (in EE->runFunction(TheF, Args)):
#include "llvm/Module.h"
#include "llvm/Function.h"
#include "llvm/PassManager.h"
#include "llvm/CallingConv.h"
#include "llvm/Analysis/Verifier.h"
#include "llvm/Assembly/PrintModulePass.h"
#include "llvm/Support/IRBuilder.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/ExecutionEngine/JIT.h"
#include "llvm/ExecutionEngine/GenericValue.h"
using namespace llvm;
Module* makeLLVMModule() {
// Module Construction
LLVMContext& ctx = getGlobalContext();
Module* mod = new Module("test", ctx);
Constant* c = mod->getOrInsertFunction("fact64",
/*ret type*/ IntegerType::get(ctx,64),
IntegerType::get(ctx,64),
/*varargs terminated with null*/ NULL);
Function* fact64 = cast<Function>(c);
fact64->setCallingConv(CallingConv::C);
/* Arg names */
Function::arg_iterator args = fact64->arg_begin();
Value* x = args++;
x->setName("x");
/* Body */
BasicBlock* block = BasicBlock::Create(ctx, "entry", fact64);
BasicBlock* xLessThan2Block= BasicBlock::Create(ctx, "xlst2_block", fact64);
BasicBlock* elseBlock = BasicBlock::Create(ctx, "else_block", fact64);
IRBuilder<> builder(block);
Value *One = ConstantInt::get(Type::getInt64Ty(ctx), 1);
Value *Two = ConstantInt::get(Type::getInt64Ty(ctx), 2);
Value* xLessThan2 = builder.CreateICmpULT(x, Two, "tmp");
//builder.CreateCondBr(xLessThan2, xLessThan2Block, cond_false_2);
builder.CreateCondBr(xLessThan2, xLessThan2Block, elseBlock);
/* Recursion */
builder.SetInsertPoint(elseBlock);
Value* xMinus1 = builder.CreateSub(x, One, "tmp");
std::vector<Value*> args1;
args1.push_back(xMinus1);
Value* recur_1 = builder.CreateCall(fact64, args1.begin(), args1.end(), "tmp");
Value* retVal = builder.CreateBinOp(Instruction::Mul, x, recur_1, "tmp");
builder.CreateRet(retVal);
/* x<2 */
builder.SetInsertPoint(xLessThan2Block);
builder.CreateRet(One);
return mod;
}
int main(int argc, char**argv) {
long long x;
if(argc > 1)
x = atol(argv[1]);
else
x = 4;
Module* Mod = makeLLVMModule();
verifyModule(*Mod, PrintMessageAction);
PassManager PM;
PM.add(createPrintModulePass(&outs()));
PM.run(*Mod);
// Now we are going to create the JIT
ExecutionEngine *EE = EngineBuilder(Mod).create();
// Call the function with argument x:
std::vector<GenericValue> Args(1);
Args[0].IntVal = APInt(64, x);
Function* TheF = cast<Function>(Mod->getFunction("fact64")) ;
/* The following CRASHES.. */
GenericValue GV = EE->runFunction(TheF, Args);
outs() << "Result: " << GV.IntVal << "\n";
delete Mod;
return 0;
}
Edit:
The correct way to enable the JIT (see the accepted answer below):
1. #include "llvm/ExecutionEngine/JIT.h"
2. InitializeNativeTarget();
I would bet that the ExecutionEngine pointer is null.... You are missing a call to InitializeNativeTarget, the documentation says:
InitializeNativeTarget - The main program should call this function to initialize the native target corresponding to the host. This is useful for JIT applications to ensure that the target gets linked in correctly.
Since there is no JIT compiler available without calling InitializeNativeTarget, EngineBuilder selects the interpreter (if available). That is probably not what you wanted. You may want to look at my previous post on this subject.
#include "llvm/ExecutionEngine/Interpreter.h"
Including that header (llvm/ExecutionEngine/Interpreter.h) forces a static initialisation of the JIT. Not the best design decision, but at least it works.