C++ compilation at runtime

C++ compilation at runtime - c++

So I watched this video months back where Epic showcased their new Unreal Engine 4 developer features. Sorry I could not find the video but I'll try my best to explain.
The one feature that got my attention was the C++ "on the fly" modification and compilation. The guy showed how he was playing the game in editor and modified some variables in the code, saved it and the changes were immediately mirrored in game.
So I've been wondering... How does one achieve this? Currently I can think of two possible ways: either a hoax and it was only "c style"-scripting language not C++ itself OR it's shared library (read: DLL) magic.
Here's something I whipped up to try myself(simplified):
for(;;)
{
Move DLL from build directory to execution directory
Link to DLL using LoadLibrary() / dlopen() and fetch a pointer to function "foo()" from it
for(;;)
{
Call the function "foo()" found in the dll
Check the source of the dll for changes
If it has changed, attempt to compile
On succesfull compile, break
}
Unload DLL with FreeLibrary() / dlclose()
}
Now that seems to work but I can't help but wonder what other methods are there?
And how would this kind of approach compare versus using a scripting language?
edit: https://www.youtube.com/watch?v=MOvfn1p92_8&t=10m0s

Yes, "hot code modification" it's definitely a capability that many IDEs/debuggers can have to one extent or another. Here's a good article:
http://www.technochakra.com/debugging-modifying-code-at-runtime/
Here's the man page for MSVS "Edit and Continue":
http://msdn.microsoft.com/en-us/library/esaeyddf%28v=vs.80%29.aspx

Epic's Hot Reload works by having the game code compiled and loaded as a dll/.so, and this is then dynamically loaded by the engine. Runtime reloading is then simply re-compiling the library and reloading, with state stored and restored if needed.
There are alternatives. You could take a look at Runtime Compiled C++ (or see RCC++ blog and videos), or perhaps try one of the other alternatives I've listed on the wiki.
As the author of Runtime Compiled C++ I think it has some advantages - it's far faster to recompile since it only compiles the code you need, and the starting point can be a single exe, so you don't need to configure a seperate project for a dll. However it does require some learning and a bit of extra code to get this to work.

Neither C nor C++ require ahead-of-time compilation, although the usual target environments (operating systems, embedded systems, high-performance number crunching) often benefit greatly from AOT.
It's completely possible to have a C++ script interpreter. As long as it adheres to the behavior in the Standard, it IS C++, and not a "hoax" as you suggest.
Replacement of a shared library is also possible, although to make it work well you'll need a mechanism for serializing all state and reloading it under the new version.

Related

Precompile script into objects inside C++ application

I need to provide my users the ability to write mathematical computations into the program. I plan to have a simple text interface with a few buttons including those to validate the script grammar, save etc.
Here's where it gets interesting. These functions the user is writing need to execute at multi-megabyte line speeds in a communications application. So I need the speed of a compiled language, but the usage of a script. A fully interpreted language just won't cut it.
My idea is to precompile the saved user modules into objects at initialization of the C++ application. I could then use these objects to execute the code when called upon. Here are the workflows I have in mind:
1) Testing(initial writing) of script: Write code in editor, save, compile into object (testing grammar), run with test I/O, Edit Code
2) Use of Code (Normal operation of application): Load script from file, compile script into object, Run object code, Run object code, Run object code, etc.
I've looked into several off the shelf interpreters, but can't find what I'm looking for. I considered JAVA, as it is pretty fast, but I would need to load the JAVA virtual machine, which means passing objects between C and the virtual machine... The interface is the bottleneck here. I really need to create a native C++ object running C++ code if possible. I also need to be able to run the code on multiple processors effectively in a controlled manner.
I'm not looking for the whole explanation on how to pull this off, as I can do my own research. I've been stalled for a couple days here now, however, and I really need a place to start looking.
As a last resort, I will create my own scripting language to fulfill the need, but that seems a waste with all the great interpreters out there. I've also considered taking an existing open source complier and slicing it up for the functionality I need... just not saving the compiled results to disk... I don't know. I would prefer to use a mainline language if possible... but that's not required.
Any help would be appreciated. I know this is not your run of the mill idea I have here, but someone has to have done it before.
Thanks!
P.S.
One thought that just occurred to me while writing this was this: what about using a true C compiler to create object code, save it to disk as a dll library, then reload and run it inside "my" code? Can you do that with MS Visual Studio? I need to look at the licensing of the compiler... how to reload the library dynamically while the main application continues to run... hmmmmm I could then just group the "functions" created by the user into library groups. Ok that's enough of this particular brain dump...

A possible solution could be use gcc (MingW since you are on windows) and build a DLL out of your user defined code. The DLL should export just one function. You can use the win32 API to handle the DLL (LoadLibrary/GetProcAddress etc.) At the end of this job you have a C style function pointer. The problem now are arguments. If your computation has just one parameter you can fo a cast to double (*funct)(double), but if you have many parameters you need to match them.

I think I've found a way to do this using standard C.
1) Standard C needs to be used because when it is compiled into a dll, the resulting interface is cross compatible with multiple compilers. I plan to do my primary development with MS Visual Studio and compile objects in my application using gcc (windows version)
2) I will expose certain variables to the user (inputs and outputs) and standardize them across units. This allows multiple units to be developed with the same interface.
3) The user will only create the inside of the function using standard C syntax and grammar. I will then wrap that function with text to fully define the function and it's environment (remember those variables I intend to expose?) I can also group multiple functions under a single executable unit (dll) using name parameters.
4) When the user wishes to test their function, I dump the dll from memory, compile their code with my wrappers in gcc, and then reload the dll into memory and run it. I would let them define inputs and outputs for testing.
5) Once the test/create step was complete, I have a compiled library created which can be loaded at run time and handled via pointers. The inputs and outputs would be standardized, so I would always know what my I/O was.
6) The only problem with standardized I/O is that some of the inputs and outputs are likely to not be used. I need to see if I can put default values in or something.
So, to sum up:
Think of an app with a text box and a few buttons. You are told that your inputs are named A, B, and C and that your outputs are X, Y, and Z of specified types. You then write a function using standard C code, and with functions from the specified libraries (I'm thinking math etc.)
So now your done... you see a few boxes below to define your input. You fill them in and hit the TEST button. This would wrap your code in a function context, dump the existing dll from memory (if it exists) and compile your code along with any other functions in the same group (another parameter you could define, basically just a name to the user.) It then runs the function using a functional pointer, using the inputs defined in the UI. The outputs are sent to the user so they can determine if their function works. If there are any compilation errors, that would also be outputted to the user.
Now it's time to run for real. Of course I kept track of what functions are where, so I dynamically open the dll, and load all the functions into memory with functional pointers. I start shoving data into one side and the functions give me the answers I need. There would be some overhead to track I/O and to make sure the functions are called in the right order, but the execution would be at compiled machine code speeds... which is my primary requirement.
Now... I have explained what I think will work in two different ways. Can you think of anything that would keep this from working, or perhaps any advice/gotchas/lessons learned that would help me out? Anything from the type of interface to tips on dynamically loading dll's in this manner to using the gcc compiler this way... etc would be most helpful.
Thanks!

Embed C++ compiler in application

Aren't shaders cool? You can toss in just a plain string and as long as it is valid source, it will compile, link and execute. I was wondering if there is a way to embed GCC inside a user application so that it is "self sufficient" e.g. has the internal capability to compile native binaries compatible to itself.
So far I've been invoking stand alone GCC from a process, started inside the application, but I was wondering if there is some API or something that could allow to use "directly" rather than a standalone compiler. Also, in the case it is possible, is it permitted?
EDIT: Although the original question was about CGG, I'd settle for information how to embed LLVM/Clang too.
And now a special edit for people who cannot put 2 + 2 together: The question asks about how to embed GCC or Clang inside of an executable in a way that allows an internal API to be used from code rather than invoking compilation from a command prompt.

I'd add +1 to the suggestion to use Clang/LLVM instead of GCC. A few good reasons why:
it is more modular and flexible
compilation time can be substantially lower than GCC
it supports the platforms you listed in the comments
it has an API that can be used internally
string source = "app.c";
string target= "app";
llvm::sys::Path clangPath = llvm::sys::Program::FindProgramByName("clang");
// arguments
vector<const char *> args;
args.push_back(clangPath.c_str());
args.push_back(source.c_str());
args.push_back("-l");
args.push_back("curl");
clang::TextDiagnosticPrinter *DiagClient = new clang::TextDiagnosticPrinter(llvm::errs(), clang::DiagnosticOptions());
clang::IntrusiveRefCntPtr<clang::DiagnosticIDs> DiagID(new clang::DiagnosticIDs());
clang::DiagnosticsEngine Diags(DiagID, DiagClient);
clang::driver::Driver TheDriver(args[0], llvm::sys::getDefaultTargetTriple(), target, true, Diags);
clang::OwningPtr<clang::driver::Compilation> c(TheDriver.BuildCompilation(args));
int res = 0;
const clang::driver::Command *FailingCommand = 0;
if (c) res = TheDriver.ExecuteCompilation(*c, FailingCommand);
if (res < 0) TheDriver.generateCompilationDiagnostics(*c, FailingCommand);

Yes, it is possible, for example, QEMU does it.
I don't have any personal experience in this field, but from what I've read, it seems that LLVM might be better suited for embedding and extending than GCC.

Some older list of C++ compilers and interpreters is available at http://www.thefreecountry.com/compilers/cpp.shtml.
Answer to the "self sufficient" application is usually a good language interpreter. There are many of them out there, many compile the code into a byte code files. Very popular and easily embeddable is the Lua language interpreter. Even some strong players use it.
There was also an open source C++ interpreter with great language compatibility produced years ago starting with F.. Don't remember the rest of the name. There are also many other tools able to produce native binaries (e.g. Free Pascal).
Choice of the language and the target platform depends on the intentions. What would be the "self sufficiency" good for. Who will write those libraries. Once you have that clear - use Google - there is a wildlife out there. One of the latest beasts is the open sourced C# compiler "Roslyn"
EDIT
If you need some C compiler (as you generate C subset) that can be "embedded" you are probably looking for a "portable C compiler" in the sense that you can put it on USB stick and carry with you. Portable applications can be easily "embedded" into other applications and can be easily included in the installer.
Possibility to "embed" compiler as statically linked code into main application binary is probably not required.
Some reference to portable MinGW is described in this https://stackoverflow.com/questions/7617410/portable-c-compiler-ide SO question.
An open source C++ editor with integrated MinGW is here https://code.google.com/p/pocketcpp/.
I don't have anything more to say as I'd have to go and browse Google - so I will not win the bounty :)

Why not just call the compiler and linker from your application using fork()/exec() (for UNIX-like platforms)? Create a shared library that you can then load with dlopen().
This avoids possible licensing issues and gives you less of a maintenance burden.
This is e.g. what varnish does with its configuration files;
The VCL language is a small domain-specific language designed to be used to define request handling and document caching policies for Varnish Cache.
When a new configuration is loaded, the varnishd management process translates the VCL code to C and compiles it to a shared object which is then dynamically linked into the server process.

A boot loader in C++

I have messed around a few times by making a small assembly boot loader on a floppy disk and was wondering if it's possible to make a boot loader in c++ and if so where might I begin? For all I know im not sure it would even use int main().
Thanks for any help.

If you're writing a boot loader, you're essentially starting from nothing: a small chunk of code is loaded into memory, and executed. You can write the majority of your boot loader in C++, but you will need to bootstrap your own C++ runtime environment first.
Assembly is really the only option for the first stage, as you need to set up a sensible environment for running anything higher-level. Doing enough to run C code is fairly straightforward -- you need:
code and data loaded in the right place;
there may be an additional part of the data area which must be zero-initialised;
you need to point the stack pointer at a suitable area of memory for the stack.
Then you can jump into the code at an appropriate point (e.g. main()) and expect that the basic language features will work. (It's possible that any features of the standard library that may have been implemented or linked in might require additional initialisation at this stage.)
Getting a suitable environment going for C++ requires more effort, as it needs more initialisation here, and also has core language features which require runtime support (again, this is before considering library features). These include:
running static constructors;
memory allocation to support new and delete;
support for run-time type information (RTTI);
support for exceptions;
probably some other things I've forgotten to mention.
None of these are required until the C environment is up and running, so the code that handles these can be written in C rather than assembler (or even in a subset of C++ that does not make use of the above features).
(The same principles apply in embedded systems, and it's not uncommon for such systems to make use of C++, but only in a limited way -- e.g. no exceptions and/or RTTI because the runtime support isn't implemented.)

It's been a while since I played with writing bootloaders, so I'm going off memory.
For an x86 bootloader, you need to have a C++ compiler that can emit x86 assembly, or, at the very least, you need to write your own preamble in 16-bit assembly that will put the CPU into 32-bit protected (or 64-bit long) mode, before you can call your C++ functions.
Once you've done that, though, you should be able to make use of most, if not all, of C++'s language features, so long as you stay away from things that require an underlying libc. But statically link everything without the CRT and you're golden.

Bootloaders don't have "int main()"s, unless you write assembly code to call it.
If you are writing a stage 1 bootloader, then it is seriously discouraged.
Otherwise, the osdev.org has great documentation on the topic.
While it is probably possible to make a bootloader in C++, remember not to link your code to any dynamic libraries, and remember that just because it is C++, that doesn't mean you can/should use the STL, etc.

Yes it is possible. You have elements of answer and usefull links in this question
You also can have a look here, there is a C++ bootloader example.
The main thing to understand is that you need to create a flat binary instead of the usual fancy executable file formats (PE on windows, or ELF on Unixes), because these file format need an OS to load them, and in a boot loader you don't have an OS yet.
Using library is not a problem if you link statically (no dynamic link because again of the above executable problem). But obviously all OS API related entry points are not available...

Any tutorial for embedding Clang as script interpreter into C++ Code?

I have no experience with llvm or clang, yet. From what I read clang is said to be easily embeddable Wikipedia-Clang, however, I did not find any tutorials about how to achieve this. So is it possible to provide the user of a c++ application with scripting-powers by JIT compiling and executing user-defined code at runtime? Would it be possible to call the applications own classes and methods and share objects?
edit: I'd prefer a C-like syntax for the script-languge (or even C++ itself)

I don't know of any tutorial, but there is an example C interpreter in the Clang source that might be helpful. You can find it here: http://llvm.org/viewvc/llvm-project/cfe/trunk/examples/clang-interpreter/
You probably won't have much of a choice of syntax for your scripting language if you go this route. Clang only parses C, C++, and Objective C. If you want any variations, you may have your work cut out for you.

I think here's what exactly you described.
http://root.cern.ch/drupal/content/cling

You can use clang as a library to implement JIT compilation as stated by other answers.
Then, you have to load up the compiled module (say, an .so library).
In order to accomplish this, you can use standard dlopen (unix) or LoadLibrary (windows) to load it, then use dlsym (unix) to dynamically reference compiled functions, say a "script" main()-like function whose name is known. Note that for C++ you would have to use mangled symbols.
A portable alternative is e.g. GNU's libltdl.
As an alternative, the "script" may run automatically at load time by implementing module init functions or putting some static code (the constructor of a C++ globally defined object would be called immediately).
The loaded module can directly call anything in the main application. Of course symbols are known at compilation time by using the proper main app's header files.
If you want to easily add C++ "plugins" to your program, and know the component interface a priori (say your main application knows the name and interface of a loaded class from its .h before the module is loaded in memory), after you dynamically load the library the class is available to be used as if it was statically linked. Just be sure you do not try to instantiate a class' object before you dlopen() its module.
Using static code allows to implement nice automatic plugin registration mechanisms too.

I don't know about Clang but you might want to look at Ch:
http://www.softintegration.com/
This is described as an embeddable or stand-alone c/c++ interpreter. There is a Dr. Dobbs article with examples of embedding it here:
http://www.drdobbs.com/architecture-and-design/212201774
I haven't done more than play with it but it seems to be a stable and mature product. It's commercial, closed-source, but the "standard" version is described as free for both personal and commercial use. However, looking at the license it seems that "commercial" may only include internal company use, not embedding in a product that is then sold or distributed. (I'm not a lawyer, so clearly one should check with SoftIntegration to be certain of the license terms.)

I am not sure that embedding a C or C++ compiler like Clang is a good idea in your case. Because the "script", that is the (C or C++) code fed (at runtime!) can be arbitrary so be able to crash the entire application. You usually don't want faulty user input to be able to crash your application.
Be sure to read What every C programmer should know about undefined behavior because it is relevant and applies to C++ also (including any "C++ script" used by your application). Notice that, unfortunately, a lot of UB don't crash processes (for example a buffer overflow could corrupt some completely unrelated data).
If you want to embed an interpreter, choose something designed for that purpose, like Guile or Lua, and be careful that errors in the script don't crash the entire application. See this answer for a more detailed discussion of interpreter embedding.

DLLs and STLs and static data (oh my!)

OK..... I've done all the reading on related questions, and a few MSDN articles, and about a day's worth of googling.
What's the current "state of the art" answer to this question:
I'm using VS 2008, C++ unmanaged code. I have a solution file with quite a few DLLs and quite a few EXEs. As long as I completely control the build environment, such that all pieces and parts are built with the same flags, and use the same runtime libaries, and no one has a statically linked CRT library, am I ok to pass STL objects around?
It seems like this should be OK, but depending on which article you read, there's lots of Fear, Uncertainty, and Doubt.
I know there's all sorts of problems with templates that produce static data behind the scenes (every dll would get their own copy, leading to heartache), but what about regular old STL?

As long as they ALL use the exact same version of runtime DLLs, there should be no problem with STL. But once you happen to have several around, they will use for instance different heaps - leading to no end of troubles.

We successfully pass STL objects around in our application which is made up from dozens of DLLs. To ensure it works one of our automated tests that runs at every build is to verify the settings for all projects. If you add a new project and misconfigure it, or break the configuration of an existing project, the build fails.
The settings we check are as follows. Note not all of these will cause issues, but we check them for consistency.
#defines
_WIN32_WINNT
STRICT
_WIN32_IE
NDEBUG
_DEBUG
_SECURE_SCL
Compiler options
DebugInformationFormat
WholeProgramOptimization
RuntimeLibrary

We use stl collections in our application and pass them to and from methods in different dlls (usually as references). This doesn't cause any trouble.
The only area where we have had trouble is where one dll allocates memory and another dll tries to delete it. This only is reported as bad, but I am not sure why. However it only seems to be a problem on Debug builds (where it is reported), but still works on release builds. Having said that where ever I come across this I do fix it.
If I was writing a 3rd party library I would think twice about using stl parameters in the api. Previously (VC6) we had to use the OCI (Oracles C api) as opposed to OCCI (Oracles C++ api) because it only worked with the Microsoft STL implementation and we were using stlport. Of course if you enable your clients to build the library with their own stl implementation this is not an issue.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js