I’m trying to JIT compile some functions in an existing C/C++ program at runtime, but I’m running into some trouble with global variable initialization. Specifically, the approach I’ve taken is to use Clang to precompile the program into IR bitcode modules in addition to the executable. At runtime, the program loads the modules, transforms them (program specialization), compiles and executes them. As it turns out, I have some global variables that get initialized and modified during execution of the “host” program. Currently, these globals are also getting initialized in the JIT compiled code, whereas I’d like them to be mapped to the host global variables instead. Can someone help me with this?
A small repro is excerpted below. Full source code is here. The file somefunc.cpp gets precompiled during build, and is loaded in the main() function in testCompile.cpp. The global variable xyz is initialized to point to 25 in somefunc.cpp, but I’d like it to point to 10 as in main() instead. In other words, the assertion in main() should succeed.
I tried a few different ways to solve this problem. The ChangeGlobal() function attempts (unsuccessfully) to achieve this with updateGlobalMapping(). The second, more hacky approach replaces the global with a new global variable that is initialized appropriately. I can get this latter approach to work for some types of globals, but is there a more elegant approach?
//————— somefunc.h ————————
extern int *xyz;
//—————— somefunc.cpp ——————
int abc = 25;
int *xyz = &abc;
int somefunc() {
return *xyz;
}
//—————— testCompile.cpp ——————
class JitCompiler {
public:
JitCompiler(const std::string module_file);
void LoadModule(const std::string& file);
template <typename FnType>
FnType CompileFunc(FnType fn, const std::string& fn_name);
void ChangeGlobal();
private:
std::unique_ptr<LLVMContext> context_;
Module *module_;
std::unique_ptr<ExecutionEngine> engine_;
};
void JitCompiler::ChangeGlobal() {
// ----------------- #1: UpdateGlobalMapping -----------------
//auto g = engine_->FindGlobalVariableNamed("xyz");
//engine_->updateGlobalMapping(g, &xyz);
//assert(engine_->getGlobalValueAddress("xyz") == (uint64_t) &xyz);
// ----------------- #2: Replace with new global ————————
// ------- Ugly hack that works for globals of type T** ----------
auto g = engine_->FindGlobalVariableNamed("xyz");
Constant *addr_i = ConstantInt::get(*context_, APInt(64, (uint64_t) xyz));
auto addr = ConstantExpr::getIntToPtr(
addr_i, g->getType()->getPointerElementType());
GlobalVariable *n = new GlobalVariable(
*module_,
g->getType()->getPointerElementType(),
g->isConstant(),
g->getLinkage(),
addr,
g->getName() + "_new");
g->replaceAllUsesWith(n);
n->takeName(g);
g->eraseFromParent();
}
int main() {
xyz = new int (10);
JitCompiler jit("somefunc.bc");
jit.ChangeGlobal();
auto fn = jit.CompileFunc(&somefunc, "somefunc");
assert(somefunc() == fn());
}
A better approach combines the two you presented: create a new global with external linkage, map it to &xyz, and substitute it for the original:
auto g = engine_->FindGlobalVariableNamed("xyz");
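GlobalVariable *n = new GlobalVariable(
*module_, // insert the new global into the same module as the original
g->getType()->getPointerElementType(),
g->isConstant(),
GlobalValue::ExternalLinkage,
nullptr, // no initializer: the storage lives in the host program
g->getName() + "_new");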
engine_->updateGlobalMapping(n, &xyz);
g->replaceAllUsesWith(n);
n->takeName(g);
g->eraseFromParent();
I'm trying to build a system that uses dynamically loaded library files (.so files) as plugins in C++14. I build the project using gcc in combination with Qt 5.5.1 within Qt Creator.
The problem is that I don't fully understand what dlopen() (and dlsym()) actually do, and I get strange behavior because of it. Here is a simplified (not executable) version:
/*Kernel.hpp*/
class Kernel{
int loadPlugins();
};
void* sharedPtr; //The Object location is stored in here
/*Kernel.cpp*/
int Kernel::loadPlugins(){
handle1 = dlopen(<file1>, RTLD_LAZY);
init_t init = (init_t) dlsym(handle1, "init"); //init_t is just a fitting function pointer
execute_t exec = (execute_t) dlsym(handle1, "execute"); //same goes for "execute_t"
handle2 = dlopen(<file2>, RTLD_LAZY);
init_t init = (init_t) dlsym(handle2, "init");
execute_t exec = (execute_t) dlsym(handle2, "execute");
}
/*<file1.h>*/
class Test{
public:
int func();
int field = 0;
};
/*<file1.cpp>*/
int Test::func(){/*do stuff*/}
Test* test = NULL;
extern void* sharedPtr; //use the ptr from kernel
extern "C" init(){
test = new Test();
sharedPtr = (void*)test; //store address of newly created Test-Object
}
extern "C" execute(){
/* Do Stuff */
}
/*<file2.h>*/
/*<file2.cpp>*/
Test* test = NULL;
extern void* sharedPtr; //use the ptr from kernel
extern "C" init(){
test = (Test*)sharedPtr; //get address of Testobject
}
extern "C" execute(){
std::cout << test->field << std::endl; //Working perfectly
std::cout << test->func() << std::endl; //Segmentation fault
}
The precise error is a symbol lookup error for a member function with a mangled name (the demangled name is Kernel::test()).
What I think should happen:
When Kernel.loadPlugins() is called, the first library creates an object and saves its address in the main program. The second library reads that address and can use the object as if it had created it itself. So the field can be read and written, and the member function can be called.
What actually happens:
When Kernel.loadPlugins() is called, the first library creates the object, can use it as expected, and saves its address in the main program. The second library receives the address as expected and can use the field of the object as expected (the type of the field does not matter; even other objects, such as strings, worked, and valgrind does not show any leaks either), but when it tries to call func() it produces a segmentation fault.
I have two questions:
First, why does this happen?
Second, is there a clean way to fix it?
I am using this snippet in one of my classes and I get behavior that I cannot explain. There are two different outcomes: it works in my simulation with Qt Creator, but it does not work on my embedded system.
Header:
typedef struct
{
int anInt;
const char* aString;
} aStruct;
class Foo
{
...
private:
...
static const aStruct STRUCT_ARRAY[];
static const char* A_STRING;
static const char* ANOTHER_STRING;
};
CPP - Implementation
const char* Foo::A_STRING = "HelloStackOverflow";
const char* Foo::ANOTHER_STRING = "ByeStackOverflow";
const aStruct Foo::STRUCT_ARRAY[] =
{
{ 0, A_STRING},
{ 1, ANOTHER_STRING},
};
If I now access the string within my class via printf, the application crashes on my embedded machine (printf(" ... %s ...", STRUCT_ARRAY[0].aString)). Using qDebug() << STRUCT_ARRAY[0].aString in Qt Creator works fine.
It also works if I replace the array entry A_STRING directly with the literal "HelloStackOverflow". What am I missing here? In my understanding, the compiler just replaces the entry within the array with the address the pointer points to.
I am using the Atmel AVR 32-bit toolchain.
/edit: it also crashes with strcat (besides the printf problem)
/edit2: what I tried so far (did not solve it yet)
played around with the const-ness of the elements (especially of the array)
Order of elements at the header
Changed optimizer level (0|1|2|3|s)
I have an MCU with flash memory divided into sections (as usual).
The linker places the .struct_init, .struct_init_const and .struct_not_init sections at addresses that belong to flash memory section20. This is hardcoded in the linker script.
Consider following test code:
test.h
typedef struct
{
int val1;
int val2;
} mystruct_t;
test.cpp
#include "test.h"
// each variable is placed in dedicated section
// sections are placed in flash section20
// linker exports symbols with the address of each section
__attribute__((section(".struct_init")))
mystruct_t struct_init = {
.val1 = 1,.val2 = 2};
__attribute__((section(".struct_init_const")))
extern const mystruct_t struct_init_const = {
.val1 = 1, .val2 = 2};
__attribute__((section(".struct_not_init")))
mystruct_t struct_not_init;
main.cpp
#include <stdint.h>
// These symbols are exported by the linker
// and contain the addresses of the corresponding sections
extern uintptr_t LNK_STRUCT_INIT_ADDR;
extern uintptr_t LNK_STRUCT_INIT_CONST_ADDR;
extern uintptr_t LNK_STRUCT_NOT_INIT_ADDR;
// Pointers for indirect access to data
mystruct_t* struct_init_ptr = (mystruct_t*)LNK_STRUCT_INIT_ADDR;
const mystruct_t* struct_init_const_ptr = (const mystruct_t*)LNK_STRUCT_INIT_CONST_ADDR;
mystruct_t* struct_not_init_ptr = (mystruct_t*)LNK_STRUCT_NOT_INIT_ADDR;
// Extern variables declarations for DIRECT access data
extern mystruct_t struct_init;
extern const mystruct_t struct_init_const;
extern mystruct_t struct_not_init;
// These are some variables representing config values
// They can be more complex objects (classes) with internal state and logic
int param1_direct;
int param1_init_const_direct;
int param1_not_init_direct;
int param1_indirect;
int param2_init_const_indirect;
int param1_not_init_indirect;
int main(void)
{
// local variables init with direct access
int param1_direct_local = struct_init.val1;
int param1_init_const_direct_local = struct_init_const.val1;
int param1_not_init_direct_local = struct_not_init.val1;
// local variables init with indirect access
int param1_indirect_local = struct_init_ptr->val1;
int param2_init_const_indirect_local = struct_init_const_ptr->val1;
int param1_not_init_indirect_local = struct_not_init_ptr->val1;
//global variables init direct
param1_direct = struct_init.val1;
param1_init_const_direct = struct_init_const.val1;
param1_not_init_direct = struct_not_init.val1;
//global variables init indirect
param1_indirect = struct_init_ptr->val1;
param2_init_const_indirect = struct_init_const_ptr->val1;
param1_not_init_indirect = struct_not_init_ptr->val1;
while(1){
// use all variables we init above
// usage of the variables may also occur in some functions or methods
// directly or indirectly called from this loop
}
}
I want to be sure that initializing the param1_ variables actually fetches the data from flash, because the data in flash section20 can be changed by the bootloader (while the main firmware is not running).
The question is: can LTO (and other optimizations) throw away the fetches from flash and substitute the values instead, since they are known at link time from the initializers?
Which approach is better?
If LTO can substitute values, should initialization be avoided?
I know volatile can help, but is it really needed in this situation?
The code example shows different approaches to accessing and initializing the data.
The not_init version seems to be the best, because the compiler can't substitute anything. But it would be good to have some default parameters, so I'd prefer the init version if it can be used.
What approach should be chosen?
Currently I am using GCC 4.9.3, but this is a general question about any C/C++ compiler.
C and C++ both feature extern variables, which let you declare constants without immediately giving away their values:
// .h
extern int const param1;
extern char const* const param2;
// ...
In general you would define them in a (single) source file, which would hide them away from anything not in this source file. This is not LTO resilient, of course, but if you can disable LTO it is an easy enough strategy.
If disabling LTO is not an option, another solution is to not define them, let LTO produce a binary, and then use a script to splice the definitions in the produced binary in the right section (the one that can be flashed).
With the value not available at LTO time, you are guaranteed that it will not be substituted.
As for the solutions you presented: volatile is indeed a standard-compliant solution, but it implies that the value is not constant, which prevents caching it at run time. Whether this is acceptable is for you to decide; just be aware that it might have a performance impact, which, since you are using LTO, I assume you would like to avoid.
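One way to limit that cost is to perform a single volatile read at start-up and keep the result in an ordinary variable. The sketch below assumes this is acceptable for the firmware in question. The helper read_once and the function load_params are hypothetical names, and struct_init here is an ordinary global standing in for the flash-resident struct. It also relies on the compiler treating an access through a volatile-qualified lvalue as a volatile access, which GCC and Clang do in practice.

```cpp
struct mystruct_t {
    int val1;
    int val2;
};

// Stand-in for the struct placed in flash by the linker script (assumption:
// in the real firmware this would carry the section attribute).
mystruct_t struct_init = {1, 2};

// Hypothetical helper: performs exactly one load through a volatile-qualified
// lvalue, so the compiler is not supposed to substitute the initializer value.
inline int read_once(const int& field) {
    return *static_cast<const volatile int*>(&field);
}

// Ordinary (non-volatile) cache of the config value; later reads can be
// optimized freely.
int param1_direct;

// Hypothetical start-up routine: one forced fetch, then cached.
void load_params() {
    param1_direct = read_once(struct_init.val1);
}
```

Calling load_params() once at the top of main() leaves param1_direct as a plain int, so only that one load pays the price of volatile.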
One of the namespaces in my program is spread across two files. One provides the "engine", the other uses the "engine" to perform various commands. All of the initialization is performed on the "engine" side, including caching parameters fetched from the setup library.
So, there's engine.cpp with:
#include <stdio.h>
#include "ns.h"
namespace MyNS
{
unsigned char variable = 0;
void init()
{
variable = 5;
printf("Init: var = %d\n",variable);
}
void handler()
{
// printf("Handler: var = %d\n",variable);
}
}
The variable happens never to be used again in engine.cpp but it's extensively used in commands.cpp.
#include <stdio.h>
#include "ns.h"
namespace MyNS
{
extern unsigned char variable;
void command()
{
printf("Command: var = %d\n",variable);
}
}
After compiling and linking, I'm getting:
Init: var = 5
Command: var = 1
Now, if I uncomment the printf() in handler() I'm getting:
Engine: var = 5
Command: var = 5
Handler: var = 5
What would be the "correct" way to force GCC not to optimize it away in such a way that accessing it through extern from the other file would fetch the right value? Preferably without reducing the -O level for the rest of the application?
(For completeness, main.cpp and ns.h:)
#include "ns.h"
int main(int argc, char** argv)
{
MyNS::init();
MyNS::command();
MyNS::handler();
return 0;
}
namespace MyNS
{
void init();
void command();
void handler();
}
This minimized test case doesn't exhibit this particular behavior; it seems the situation has to occur in a much more complex environment for it to happen...
Eh... the solution was quite trivial.
I swapped the places of the declaration and the definition of the variable.
engine.cpp:
extern unsigned char variable;
command.cpp:
unsigned char variable = 0;
That way the compiler has no doubt that the variable must exist while compiling commands, and in engine it has to refer to the existing instance; it can't just create a temporary one on the spot.
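A related arrangement, sketched here as an assumption rather than as what the original project does, is to put the single extern declaration into the shared header so that both source files see exactly the same declaration. The three files are collapsed into one listing below so that it compiles on its own; the comments mark where each part would live.

```cpp
#include <cstdio>

// --- ns.h (assumption: the declaration is moved here) ---
namespace MyNS
{
    extern unsigned char variable;  // declaration only, no storage
    void init();
    void command();
}

// --- commands.cpp: the one and only definition ---
namespace MyNS
{
    unsigned char variable = 0;

    void command()
    {
        printf("Command: var = %d\n", variable);
    }
}

// --- engine.cpp: uses the variable through the header's declaration ---
namespace MyNS
{
    void init()
    {
        variable = 5;
        printf("Init: var = %d\n", variable);
    }
}
```

With this layout, a mismatch between the declared and the defined type shows up as a compile error in the file that holds the definition, instead of going unnoticed across files.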
EDIT: Now I've discovered another peculiarity. The value changes depending on where it's written to. The section of code in question is:
1: varso = SharedObject::Instance()->varso;
2: memset(det_map,0,sizeof(det_map));
3: memset(gr_map,0xFF,sizeof(gr_map));
4: memset(gr_ped,false,sizeof(gr_ped));
5: memset(&stan,0,sizeof(stan));
6: stan.SOTUstage = 1;
7: PR_SOTU = varso->NrPSOTU;
The variable is assigned near a place where several arrays are initialized with memset. The variable in question is PR_SOTU (the uppercase name is inherited from when it was still a macro, and since it is used alongside several other macros in a very similar context, it's likely to stay that way).
If I move the assignment from line 7 and place it after line 1, 2 or 3, it receives the correct value 5. Placed after line 4 it gets the value 18. Anywhere below that, the value is 1. I moved the definition of the variable to a different place (it was the last in the list of all namespace globals, now it's the first) to exclude the possibility that something writes to that specific memory location, but the behavior remains.
In the C language, in order to initialize a static local variable to a value unknown during compilation, I would normally do something like this (for example):
void func()
{
static int var = INVALID_VALUE;
if (var == INVALID_VALUE)
var = some_other_func();
...
}
In the C++ language, I can simply do:
void func()
{
static int i = some_other_func();
...
}
The only way (that I can think of) for a C++ compiler to implement this correctly is to replace this code with a mechanism similar to the C example above.
But how would the compiler determine a "proper" invalid value? Or is there another way that I haven't taken into consideration?
Thanks
Clarification:
INVALID_VALUE is a value which function some_other_func never returns.
It is used in order to ensure that this function is never invoked more than once.
The compiler does not generate code that checks the variable's value; instead it uses a thread-safe flag that ensures the initialization code is executed only once.
Something like this:
void func()
{
static int i;
static bool i_initialized;
if (!i_initialized) {
i = some_other_func();
i_initialized = true;
}
}
Except that in general the flag is not a plain bool but something that can be tested in a thread-safe way.
Judging from the code seen when disassembling and debugging g++ output, there is a hidden variable that is initialized to 0 and set to 1 once the initialization has run.
So the next time the function is called, the initialization code isn't executed.
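The once-only behavior can be observed directly with a small sketch. Here call_count and the return value 42 are hypothetical, added only to make the effect visible. func_manual is a hand-written approximation of the idea that uses double-checked locking; it is not the code g++ actually emits (under the Itanium C++ ABI the real mechanism is a hidden guard variable handled by runtime functions such as __cxa_guard_acquire).

```cpp
#include <atomic>
#include <mutex>

static int call_count = 0;  // hypothetical: counts initializer invocations

int some_other_func() {
    ++call_count;
    return 42;  // hypothetical value
}

// What the question writes: the compiler adds the hidden guard itself.
int func_builtin() {
    static int i = some_other_func();
    return i;
}

// Hand-written approximation of the hidden guard (illustration only).
int func_manual() {
    static int i;
    static std::atomic<bool> initialized{false};
    static std::mutex guard;

    if (!initialized.load(std::memory_order_acquire)) {
        std::lock_guard<std::mutex> lock(guard);
        // Re-check under the lock: another thread may have finished first.
        if (!initialized.load(std::memory_order_relaxed)) {
            i = some_other_func();
            initialized.store(true, std::memory_order_release);
        }
    }
    return i;
}
```

Calling either function repeatedly runs some_other_func() only on the first call, and no "invalid" sentinel value of i is ever needed because the flag is separate from the variable.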