LLVM JIT support for caching compiled output - llvm

I am trying to see how to avoid LLVM JIT compilation every time and use a cached copy instead. I see that LLVM has ObjectCache support for code generation from a module, but to get a module from a file or code string, it needs to be compiled and go through various optimization passes. What is the best way to go about it?
1. Cache the final image object to a file; on lookup, parse the file and create the ExecutionEngine with the image so one can execute it (get a pointer to the function and call it).
2. Save the intermediate output of compilation and optimization, i.e., write the module to a file (e.g., using dump) and read it back (parse the IR), then use the ObjectCache support for code generation from this module.
Option (2) takes two steps and seems likely worse than (1), but is (1) the right way to go about it?

Given that you have an instance of ObjectFile, you can write it to disk:
std::string cacheName("some_name.o");
std::error_code EC;
raw_fd_ostream outfile(cacheName, EC, sys::fs::F_None);
outfile.write(object.getBinary()->getMemoryBufferRef().getBufferStart(),
              object.getBinary()->getMemoryBufferRef().getBufferSize());
outfile.close();
Then you can read it back from disk:
std::string cacheName("some_name.o");
ErrorOr<std::unique_ptr<MemoryBuffer>> buffer =
    MemoryBuffer::getFile(cacheName.c_str());
if (!buffer) {
  // handle error
}
Expected<std::unique_ptr<ObjectFile>> objectOrError =
    ObjectFile::createObjectFile(buffer.get()->getMemBufferRef());
if (!objectOrError) {
  // handle error
}
std::unique_ptr<ObjectFile> objectFile(std::move(objectOrError.get()));
auto owningObject = OwningBinary<ObjectFile>(std::move(objectFile),
                                             std::move(buffer.get()));
auto object = owningObject.getBinary();
You can take this code and plug it into your custom ObjectCache, then feed the object cache to the JIT engine.
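To make the last step concrete, here is a rough sketch of that glue, assuming the MCJIT-era ObjectCache interface (notifyObjectCompiled/getObject); the cacheFileFor() naming scheme is hypothetical and error handling is elided:

```cpp
#include "llvm/ExecutionEngine/ObjectCache.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

class FileObjectCache : public ObjectCache {
public:
  // Called by the JIT after compiling a module: write the object to disk.
  void notifyObjectCompiled(const Module *M, MemoryBufferRef Obj) override {
    std::error_code EC;
    raw_fd_ostream Out(cacheFileFor(M), EC, sys::fs::F_None);
    if (!EC)
      Out.write(Obj.getBufferStart(), Obj.getBufferSize());
  }

  // Called by the JIT before compiling a module: return the cached object
  // if we have one, or nullptr to force a fresh compile.
  std::unique_ptr<MemoryBuffer> getObject(const Module *M) override {
    auto Buf = MemoryBuffer::getFile(cacheFileFor(M));
    if (!Buf)
      return nullptr;
    return std::move(*Buf);
  }

private:
  // Hypothetical naming scheme; derive a stable key from the module.
  std::string cacheFileFor(const Module *M) {
    return M->getModuleIdentifier() + ".o";
  }
};
```

You would then register it on the engine with something like `EE->setObjectCache(&Cache);` before the first compilation.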
I hope it helps.

Related

How to attach debug information into an instruction in a LLVM Pass

I am trying to collect some information from my LLVM optimization pass during runtime. In other words, I want to know the physical address of a specific IR instruction after compilation. So my idea is to convert the LLVM metadata into LLVM DWARF data that can be used during runtime. Instead of attaching the filename and line numbers, I want to attach my own information. My question falls into two parts:
Here is code that gets the filename and line number of an instruction:
if (DILocation *Loc = I->getDebugLoc()) { // Here I is an LLVM instruction
  unsigned Line = Loc->getLine();
  StringRef File = Loc->getFilename();
  StringRef Dir = Loc->getDirectory();
  bool ImplicitCode = Loc->isImplicitCode();
}
But how can I set these fields? I could not find a relevant function.
How can I see the updated debug information (filename and line numbers) at runtime? I compiled with -g, but I still do not see the debug information.
Thanks
The function you need is setDebugLoc(), and the info is only included in the result if you include enough of it. The module verifier will tell you what you're missing. These two lines might also be what's tripping you up:
module->addModuleFlag(Module::Warning, "Dwarf Version", dwarf::DWARF_VERSION);
module->addModuleFlag(Module::Warning, "Debug Info Version", DEBUG_METADATA_VERSION);
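As a minimal sketch of the setter side: setDebugLoc() takes a DILocation, and a DILocation needs a scope (the verifier will reject a location whose scope chain is missing). Assuming the enclosing function already has a DISubprogram attached (e.g. built with DIBuilder), you could smuggle your own numbers through the line field like this:

```cpp
// Sketch: attach a fabricated source location to instruction I.
// The Line value (1234 here) is an arbitrary stand-in for your own data.
if (DISubprogram *SP = I->getFunction()->getSubprogram()) {
  DILocation *NewLoc =
      DILocation::get(SP->getContext(), /*Line=*/1234, /*Col=*/0, SP);
  I->setDebugLoc(NewLoc);
}
```

Run the module verifier afterwards; it will flag any scope or module-flag pieces that are still missing.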

LLVM how to get callsite file name and line number

I am very new to LLVM, and it's my first time writing C++.
I need to find some function info related to an LLVM CallSite; however, I have checked the source code here: LLVM CallSite Source Code
I still don't know where to get the call site's file name (e.g., the CallSite is in the file example.c) or its line number (e.g., at line 18 in the program).
Do you know how can I get call site file name and line number?
You can get this information by retrieving debug information from the called function. The algorithm is the following:
You need to get underlying called value, which is a function.
Then you need to get debug information attached to that function.
The debug information should contain everything you need.
Here is code that should do the job (I didn't run it, though):
CallSite cs = ...;
if (!cs.isCall() && !cs.isInvoke()) {
  break;
}
Function *calledFunction = dyn_cast<Function>(cs.getCalledValue());
if (!calledFunction) {
  break;
}
MDNode *metadata = calledFunction->getMetadata(0);
if (!metadata) {
  break;
}
DILocation *debugLocation = dyn_cast<DILocation>(metadata);
if (debugLocation) {
  debugLocation->getFilename();
  debugLocation->getLine();
}
Please note the breaks. They are there to show that any step may fail, so you should be ready to handle all such cases.
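One hedged aside: if what you actually want is the location of the call *site* itself (rather than debug info attached to the callee), the call instruction's own DebugLoc carries it, in the same way as the getDebugLoc() snippet in the earlier question (this requires the code to have been compiled with -g):

```cpp
// Sketch: read the call instruction's own source location.
if (DILocation *Loc = cs.getInstruction()->getDebugLoc()) {
  StringRef File = Loc->getFilename();
  unsigned Line = Loc->getLine();
}
```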

generate machine code directly via LLVM API

With the following code, I can generate an LLVM bitcode file from a module:
llvm::Module *module;
// fill module with code
module = ...;
std::error_code ec;
llvm::raw_fd_ostream out("anonymous.bc", ec, llvm::sys::fs::F_None);
llvm::WriteBitcodeToFile(module, out);
I can then use that bitcode file to generate an executable machine code file, e.g.:
clang -o anonymous anonymous.bc
Alternatively:
llc anonymous.bc
gcc -o anonymous anonymous.s
My question now is: Can I generate the machine code directly in C++ with the LLVM API without first needing to write the bitcode file?
I am looking for either a code example or at least some starting points in the LLVM API, e.g. which classes to use, nudging me in the right direction might even be enough.
I was also looking for the code for this, and #arrowd's suggestion worked.
To save the trouble for the next person, this is what I came up with.
Given a Module, it generates assembly code on stdout for your native target:
void printASM(Module *M) {
  InitializeNativeTarget();
  InitializeNativeTargetAsmPrinter();

  auto TargetTriple = sys::getDefaultTargetTriple();
  M->setTargetTriple(TargetTriple);

  std::string Error;
  const Target *target = TargetRegistry::lookupTarget(TargetTriple, Error);
  auto cpu = sys::getHostCPUName();

  // Collect the host's subtarget features (SSE, AVX, ...).
  SubtargetFeatures Features;
  StringMap<bool> HostFeatures;
  if (sys::getHostCPUFeatures(HostFeatures))
    for (auto &F : HostFeatures)
      Features.AddFeature(F.first(), F.second);
  auto features = Features.getString();

  TargetOptions Options;
  std::unique_ptr<TargetMachine> TM{
      target->createTargetMachine(TargetTriple, cpu, features, Options,
                                  Reloc::PIC_, None, CodeGenOpt::None)};

  legacy::PassManager PM;
  M->setDataLayout(TM->createDataLayout());
  // Emit assembly to stdout; the second stream (for split DWARF) is unused.
  TM->addPassesToEmitFile(PM, (raw_pwrite_stream &) outs(), nullptr,
                          TargetMachine::CodeGenFileType::CGFT_AssemblyFile,
                          true, nullptr);
  PM.run(*M);
}
If anyone knows a shorter way to write this code, feel free to correct me!
Take a look at the llc tool source, specifically the compileModule() function. In short, it creates a Target, sets some options for it via TargetOptions, then uses it to addPassesToEmitFile(), and finally asks the PassManager to perform all the planned tasks.
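For completeness, the same pipeline can emit a native object file instead of assembly on stdout. This is a sketch reusing the TM and M from printASM above, with the same caveat that the addPassesToEmitFile() signature has drifted across LLVM versions:

```cpp
// Sketch: emit a relocatable object file directly, like `llc -filetype=obj`.
std::error_code EC;
raw_fd_ostream Dest("anonymous.o", EC, sys::fs::F_None);

legacy::PassManager PM;
// addPassesToEmitFile returns true on failure.
if (TM->addPassesToEmitFile(PM, Dest, nullptr,
                            TargetMachine::CodeGenFileType::CGFT_ObjectFile)) {
  errs() << "target does not support object file emission\n";
  return;
}
PM.run(*M);
Dest.flush();
```

The resulting anonymous.o can then be linked with `clang -o anonymous anonymous.o`, skipping the bitcode file entirely.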

llvm.stackprotector of LLVM

I just got started with LLVM. I am reading the code for stack protection, which is located in lib/CodeGen/StackProtector.cpp. In this file, the InsertStackProtectors function inserts a call to llvm.stackprotector into the code:
// entry:
//   StackGuardSlot = alloca i8*
//   StackGuard = load __stack_chk_guard
//   call void @llvm.stackprotect.create(StackGuard, StackGuardSlot)
// ...(Skip some lines)
CallInst::Create(Intrinsic::getDeclaration(M, Intrinsic::stackprotector),
                 Args, "", InsPt);
This llvm.stackprotector (http://llvm.org/docs/LangRef.html#llvm-stackprotector-intrinsic) seems to be an intrinsic function of LLVM, so I tried to find the source code of this function. However, I cannot find it...
I do find one line definition of this function in include/llvm/IR/Intrinsics.td, but it does not tell how it is implemented.
So my questions are:
Where can I find the code for this llvm.stackprotector function?
What is the purpose of these *.td files?
Thank you very much!
The .td file is LLVM's use of code-generation to reduce the amount of boilerplate code. In this particular case, ./include/llvm/IR/Intrinsics.gen is generated in the build directory and contains code describing the intrinsics specified in the .td file.
As for stackprotector, there's a bunch of code in the backend for handling it. See for instance lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp: in SelectionDAGBuilder::visitIntrinsicCall it generates the actual DAG nodes that implement this intrinsic.

Is it possible to call Matlab from QtCreator?

I am doing image analysis using C++ in the QtCreator environment. In order to build a learning model, I want to use the TreeBagger class from MATLAB, which is really powerful. Can I call MATLAB from QtCreator, give it some parameters, and get back the classification error? Can I do this without using mex files?
From QProcess's Synchronous Process API example:
QProcess gzip;
gzip.start("gzip", QStringList() << "-c");
if (!gzip.waitForStarted())
    return false;
gzip.write("Qt rocks!");
gzip.closeWriteChannel();
if (!gzip.waitForFinished())
    return false;
QByteArray result = gzip.readAll();
The concept from this example is that you can execute MATLAB with whatever settings you prefer and begin writing a script to it immediately. After the write, you can close the channel, wait for it to finish, then read the results back from MATLAB. Unfortunately, I'm not experienced enough with it to provide a more direct example, but this is the general idea. Please consult the documentation for the details.
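Adapting the gzip example to MATLAB might look like the sketch below. The -nodisplay/-nosplash/-r flags are standard MATLAB command-line options; the script string is a placeholder you would replace with your TreeBagger call, and you would parse the error value out of the captured output yourself:

```cpp
// Hypothetical sketch: run MATLAB headless via QProcess and capture stdout.
QProcess matlab;
matlab.start("matlab", QStringList()
    << "-nodisplay" << "-nosplash"
    << "-r" << "disp(1 + 2); exit");  // placeholder for your real script
if (!matlab.waitForStarted())
    return false;
if (!matlab.waitForFinished(-1))      // MATLAB startup can be slow
    return false;
QByteArray result = matlab.readAllStandardOutput();
```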
MATLAB has an "engine" interface, described here, that lets standalone programs call MATLAB functions. It has the advantage that you can call engPutVariable and engGetVariable to transfer your data in binary format (I think it works by using shared memory between your process and MATLAB, but I'm not sure about this), so you don't have to convert your data to ASCII and parse the result from ASCII.
For C++ you might want to write a wrapper class for RAII, or have a look at http://www.codeproject.com/Articles/4216/MATLAB-Engine-API, where this has already been done.
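A minimal sketch of that engine round trip, assuming MATLAB's C engine library (engine.h, linked against libeng/libmx); the eval string is a stand-in for a real TreeBagger script:

```cpp
#include "engine.h"

int main() {
  Engine *ep = engOpen("");            // start (or connect to) a MATLAB session
  if (!ep)
    return 1;
  engEvalString(ep, "err = 0.25;");    // placeholder for your TreeBagger call
  mxArray *err = engGetVariable(ep, "err");
  if (err) {
    double value = mxGetScalar(err);   // classification error back in C++
    mxDestroyArray(err);
  }
  engClose(ep);
  return 0;
}
```

Wrapping engOpen/engClose in an RAII class, as suggested above, keeps the session from leaking on early returns.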