Embed C++ compiler in application - c++

Aren't shaders cool? You can toss in just a plain string and as long as it is valid source, it will compile, link and execute. I was wondering if there is a way to embed GCC inside a user application so that it is "self sufficient" e.g. has the internal capability to compile native binaries compatible to itself.
So far I've been invoking stand alone GCC from a process, started inside the application, but I was wondering if there is some API or something that could allow to use "directly" rather than a standalone compiler. Also, in the case it is possible, is it permitted?
EDIT: Although the original question was about CGG, I'd settle for information how to embed LLVM/Clang too.
And now a special edit for people who cannot put 2 + 2 together: The question asks about how to embed GCC or Clang inside of an executable in a way that allows an internal API to be used from code rather than invoking compilation from a command prompt.

I'd add +1 to the suggestion to use Clang/LLVM instead of GCC. A few good reasons why:
it is more modular and flexible
compilation time can be substantially lower than GCC
it supports the platforms you listed in the comments
it has an API that can be used internally
string source = "app.c";
string target= "app";
llvm::sys::Path clangPath = llvm::sys::Program::FindProgramByName("clang");
// arguments
vector<const char *> args;
args.push_back(clangPath.c_str());
args.push_back(source.c_str());
args.push_back("-l");
args.push_back("curl");
clang::TextDiagnosticPrinter *DiagClient = new clang::TextDiagnosticPrinter(llvm::errs(), clang::DiagnosticOptions());
clang::IntrusiveRefCntPtr<clang::DiagnosticIDs> DiagID(new clang::DiagnosticIDs());
clang::DiagnosticsEngine Diags(DiagID, DiagClient);
clang::driver::Driver TheDriver(args[0], llvm::sys::getDefaultTargetTriple(), target, true, Diags);
clang::OwningPtr<clang::driver::Compilation> c(TheDriver.BuildCompilation(args));
int res = 0;
const clang::driver::Command *FailingCommand = 0;
if (c) res = TheDriver.ExecuteCompilation(*c, FailingCommand);
if (res < 0) TheDriver.generateCompilationDiagnostics(*c, FailingCommand);

Yes, it is possible, for example, QEMU does it.
I don't have any personal experience in this field, but from what I've read, it seems that LLVM might be better suited for embedding and extending than GCC.

Some older list of C++ compilers and interpreters is available at http://www.thefreecountry.com/compilers/cpp.shtml.
Answer to the "self sufficient" application is usually a good language interpreter. There are many of them out there, many compile the code into a byte code files. Very popular and easily embeddable is the Lua language interpreter. Even some strong players use it.
There was also an open source C++ interpreter with great language compatibility produced years ago starting with F.. Don't remember the rest of the name. There are also many other tools able to produce native binaries (e.g. Free Pascal).
Choice of the language and the target platform depends on the intentions. What would be the "self sufficiency" good for. Who will write those libraries. Once you have that clear - use Google - there is a wildlife out there. One of the latest beasts is the open sourced C# compiler "Roslyn"
EDIT
If you need some C compiler (as you generate C subset) that can be "embedded" you are probably looking for a "portable C compiler" in the sense that you can put it on USB stick and carry with you. Portable applications can be easily "embedded" into other applications and can be easily included in the installer.
Possibility to "embed" compiler as statically linked code into main application binary is probably not required.
Some reference to portable MinGW is described in this https://stackoverflow.com/questions/7617410/portable-c-compiler-ide SO question.
An open source C++ editor with integrated MinGW is here https://code.google.com/p/pocketcpp/.
I don't have anything more to say as I'd have to go and browse Google - so I will not win the bounty :)

Why not just call the compiler and linker from your application using fork()/exec() (for UNIX-like platforms)? Create a shared library that you can then load with dlopen().
This avoids possible licensing issues and gives you less of a maintenance burden.
This is e.g. what varnish does with its configuration files;
The VCL language is a small domain-specific language designed to be used to define request handling and document caching policies for Varnish Cache.
When a new configuration is loaded, the varnishd management process translates the VCL code to C and compiles it to a shared object which is then dynamically linked into the server process.

Related

How to give someone a c/c++ programme without giving them the source

I want to give a C++ programme to someone for testing but I don't want them to see the source just yet. My main issues are that I don't know what platform that person is using and I don't want to create a shared library unless I have no other option. Ideally, I would like to send headers and object files for the person to compile and link him/herself but as far as I know that would only work if the person has the same set up that I have.
I am currently using Windows but I'm comfortable working on any Unix-like system as well and I am not using an IDE, in case you need that information
Well, a Windows development environment allows you to bind some native always backward compatible winapi functions. The distribution of correctly setup binary .dll files, along with consistent headers, is enough.
For Linux distributions, the scenario is different, since you need to have a distributed package compiled from source (that's disclosed), or distributed binaries for every Linux distributions you actually want to support.
If you want to avoid source code disclosure, where it's needed to compile on specific target systems, use a licencing mechanism that's preventing to run it.
Assuming the choice of machine is "reasonable" - in other words, it's something running Linux, Windows, Android or MacOS and a reasonable target processor such as MIPS, Sparc, x86 or ARM, then one POSSIBLE solution is to use clang -S -emit-llvm yourfile.cpp to produce an intermediate form of the LLVM "virtual machine code". This can then, using llc, be compiled to machine code for any target that LLVM supports.
It's not completely impossible to figure out roughly what the source code looked like, but unless someone wants to put a LOT of effort into running your code, they won't be able to see what the code does. And even giving someone a binary allows them, if they are that way inclined, to reverse engineer the code.
The other alternative, as I see it, is that you demonstrate the code on your machine [or a machine under your control].
There are also tools that can "obfuscate" source-code (rename variables, structure/class members and functions to a, b, c; remove any comments; and "unformat" the code - all of which makes it much harder to understand what the code does). Sorry, you'll have to google to find a good one, as I have never used such a thing myself. And again, of course, it's not IMPOSSIBLE to recover the code into something that can be used and modified and rebuilt. There is really no way to avoid giving the customer something they can compile unless you know what OS/processor it is for.

C++ IDE with repl?

I'm looking for a good C++ IDE with a REPL. The one in visual studio isn't... well lets say most of the time if I copy/paste a line in source the REPL rejects it even if its the line I put a breakpoint or step over.
Are there any good IDEs or REPLs for C++?
Cling
What is Cling?
Cling is an interactive C++ interpreter, built on the top of LLVM and Clang libraries. Its advantages over the standard interpreters are that it has command line prompt and uses just-in-time (JIT) compiler for compilation. Many of the developers (e.g. Mono in their project called CSharpRepl) of such kind of software applications name them interactive compilers.
One of Cling's main goals is to provide contemporary, high-performance alternative of the current C++ interpreter in the ROOT project - CINT. The backward-compatibility with CINT is major priority during the development.
http://root.cern.ch/drupal/content/cling
CINT
What is CINT?
CINT is an interpreter for C and C++ code. It is useful e.g. for situations where rapid development is more important than execution time. Using an interpreter the compile and link cycle is dramatically reduced facilitating rapid development. CINT makes C/C++ programming enjoyable even for part-time programmers.
CINT is written in C++ itself, with slightly less than 400,000 lines of code. It is used in production by several companies in the banking, integrated devices, and even gaming environment, and of course by ROOT, making it the default interpreter for a large number of high energy physicists all over the world.
http://www.hanno.jp/gotom/Cint.html
CLing should be independant of Clang and abble to compile on any platform, recent works of CERN tend to separate Cling from Clang and it's good trends.
What I don't under is mostly the existence of Clipp in C++ allowing to parse javascript embedded in my C++ code and can't find a version of Clipp for just C++ / Boost / Eigen / Quantlib.
Another thing I don't understand is why TinyCC with a 200ko size is abble to parse windows.h without a problem and LLVM team complaining about Clang on windows.H detonate on tarmac.
All in all, with fusion, spirit, wave and the so many people wishing for a C++ REPL, why after 2 decade there is not even small version of it.
Here is my solutions
Forget about REPL C++ and stick to REPL C, use tinyCC and expose only the functionnal action of method by using pointer function A.function(toto t) -> function(A *, toto t). To make it works with object method you can also use declaration as struct __declspec(novtable) A { };
This will allow binary align compatibility between tinyCC struct understanding and your true object. True you will have to split the tuple of data and the tuple of method, but after all, that should have always be the case in first place. Object design should have split data and method into a dual model rather than a mixed model good for bug. In many case, the compiler will split the object into dual model anyway. This will provide extrem fast prototyping even for scientist and user of Cling/Cint.
Second solution, rather than REPL statement, use the dynamic load/unload pair, you setup a chain of compilation ( incremental build or not ) and auto relink compiled library when the source change. It isn't slow at all. This give the advantage to do it on any supported dynamic library OS and it's very EASY to do.
Third solution, the most easiest way, boot a linux based vm ( install llvm tool chain), and use Cling on the vm. That won't work in a firm full windowOSed but it seems LLVM are windows OS haters.

C++ compilation at runtime

So I watched this video months back where Epic showcased their new Unreal Engine 4 developer features. Sorry I could not find the video but I'll try my best to explain.
The one feature that got my attention was the C++ "on the fly" modification and compilation. The guy showed how he was playing the game in editor and modified some variables in the code, saved it and the changes were immediately mirrored in game.
So I've been wondering... How does one achieve this? Currently I can think of two possible ways: either a hoax and it was only "c style"-scripting language not C++ itself OR it's shared library (read: DLL) magic.
Here's something I whipped up to try myself(simplified):
for(;;)
{
Move DLL from build directory to execution directory
Link to DLL using LoadLibrary() / dlopen() and fetch a pointer to function "foo()" from it
for(;;)
{
Call the function "foo()" found in the dll
Check the source of the dll for changes
If it has changed, attempt to compile
On succesfull compile, break
}
Unload DLL with FreeLibrary() / dlclose()
}
Now that seems to work but I can't help but wonder what other methods are there?
And how would this kind of approach compare versus using a scripting language?
edit: https://www.youtube.com/watch?v=MOvfn1p92_8&t=10m0s
Yes, "hot code modification" it's definitely a capability that many IDEs/debuggers can have to one extent or another. Here's a good article:
http://www.technochakra.com/debugging-modifying-code-at-runtime/
Here's the man page for MSVS "Edit and Continue":
http://msdn.microsoft.com/en-us/library/esaeyddf%28v=vs.80%29.aspx
Epic's Hot Reload works by having the game code compiled and loaded as a dll/.so, and this is then dynamically loaded by the engine. Runtime reloading is then simply re-compiling the library and reloading, with state stored and restored if needed.
There are alternatives. You could take a look at Runtime Compiled C++ (or see RCC++ blog and videos), or perhaps try one of the other alternatives I've listed on the wiki.
As the author of Runtime Compiled C++ I think it has some advantages - it's far faster to recompile since it only compiles the code you need, and the starting point can be a single exe, so you don't need to configure a seperate project for a dll. However it does require some learning and a bit of extra code to get this to work.
Neither C nor C++ require ahead-of-time compilation, although the usual target environments (operating systems, embedded systems, high-performance number crunching) often benefit greatly from AOT.
It's completely possible to have a C++ script interpreter. As long as it adheres to the behavior in the Standard, it IS C++, and not a "hoax" as you suggest.
Replacement of a shared library is also possible, although to make it work well you'll need a mechanism for serializing all state and reloading it under the new version.

Define C++ function at runtime

I'm trying to adjust some mathematical code I've written to allow for arbitrary functions, but I only seem to be able to do so by pre-defining them at compile time, which seems clunky. I'm currently using function pointers, but as far as I can see the same problem would arise with functors. To provide a simplistic example, for forward-difference differentiation the code used is:
double xsquared(double x) {
return x*x;
}
double expx(double x) {
return exp(x);
}
double forward(double x, double h, double (*af)(double)) {
double answer = (af(x+h)-af(x))/h;
return answer;
}
Where either of the first two functions can be passed as the third argument. What I would like to do, however, is pass user input (in valid C++) rather than having to set up the functions beforehand. Any help would be greatly appreciated!
Historically the kind of functionality you're asking for has not been available in C++. The usual workaround is to embed an interpreter for a language other than C++ (Lua and Python for example are specifically designed for being integrated into C/C++ apps to allow scripting of them), or to create a new language specific to your application with your own parser, compiler, etc. However, that's changing.
Clang is a new open source compiler that's having its development by Apple that leverages LLVM. Clang is designed from the ground up to be usable not only as a compiler but also as a C++ library that you can embed into your applications. I haven't tried it myself, but you should be able to do what you want with Clang -- you'd link it as a library and ask it to compile code your users input into the application.
You might try checking out how the ClamAV team already did this, so that new virus definitions can be written in C.
As for other compilers, I know that GCC recently added support for plugins. It maybe possible to leverage that to bridge GCC and your app, but because GCC wasn't designed for being used as a library from the beginning it might be more difficult. I'm not aware of any other compilers that have a similar ability.
As C++ is a fully compiled language, you cannot really transform user input into code unless you write your own compiler or interpreter. But in this example, it can be possible to build a simple interpreter for a Domain Specific Language which would be mathematical formulae. All depends on what you want to do.
You could always take the user's input and run it through your compiler, then executing the resulting binary. This of course would have security risks as they could execute any arbitrary code.
Probably easier is to devise a minimalist language that lets users define simple functions, parsing them in C++ to execute the proper code.
The best solution is to use an embedded language like lua or python for this type of task. See e.g. Selecting An Embedded Language for suggestions.
You may use tiny C compiler as library (libtcc).
It allows you to compile arbitrary code in run-time and load it, but it is only works for C not C++.
Generally the only way is following:
Pass the code to compiler and create shared object or DLL
Load this Shared object or DLL
Use function from this shared object.
C++, unlike some other languages like Perl, isn't capable of doing runtime interpretation of itself.
Your only option here would be to allow the user to compile small shared libraries that could be dynamically-loaded by your application at runtime.
Well, there are two things you can do:
Take full advantage of boost/C++0x lambda's and to define functions at runtime.
If only mathematical formula's are needed, libraries like muParser are designed to turn a string into bytecode, which can be seen as defining a function at runtime.
While it seems like a blow off, there are a lot of people out there who have written equation parsers and interpreters for c++ and c, many commercial, many flawed, and all as different as faces in a crowd. One place to start is the college guys writing infix to postfix translators. Some of these systems use paranthetical grouping followed by putting the items on a stack like you would find in the old HP STL library. I spent 30 seconds and found this one:
http://www.speqmath.com/tutorials/expression_parser_cpp/index.html
possible search string:"gcc 'equation parser' infix to postfix"

Any tutorial for embedding Clang as script interpreter into C++ Code?

I have no experience with llvm or clang, yet. From what I read clang is said to be easily embeddable Wikipedia-Clang, however, I did not find any tutorials about how to achieve this. So is it possible to provide the user of a c++ application with scripting-powers by JIT compiling and executing user-defined code at runtime? Would it be possible to call the applications own classes and methods and share objects?
edit: I'd prefer a C-like syntax for the script-languge (or even C++ itself)
I don't know of any tutorial, but there is an example C interpreter in the Clang source that might be helpful. You can find it here: http://llvm.org/viewvc/llvm-project/cfe/trunk/examples/clang-interpreter/
You probably won't have much of a choice of syntax for your scripting language if you go this route. Clang only parses C, C++, and Objective C. If you want any variations, you may have your work cut out for you.
I think here's what exactly you described.
http://root.cern.ch/drupal/content/cling
You can use clang as a library to implement JIT compilation as stated by other answers.
Then, you have to load up the compiled module (say, an .so library).
In order to accomplish this, you can use standard dlopen (unix) or LoadLibrary (windows) to load it, then use dlsym (unix) to dynamically reference compiled functions, say a "script" main()-like function whose name is known. Note that for C++ you would have to use mangled symbols.
A portable alternative is e.g. GNU's libltdl.
As an alternative, the "script" may run automatically at load time by implementing module init functions or putting some static code (the constructor of a C++ globally defined object would be called immediately).
The loaded module can directly call anything in the main application. Of course symbols are known at compilation time by using the proper main app's header files.
If you want to easily add C++ "plugins" to your program, and know the component interface a priori (say your main application knows the name and interface of a loaded class from its .h before the module is loaded in memory), after you dynamically load the library the class is available to be used as if it was statically linked. Just be sure you do not try to instantiate a class' object before you dlopen() its module.
Using static code allows to implement nice automatic plugin registration mechanisms too.
I don't know about Clang but you might want to look at Ch:
http://www.softintegration.com/
This is described as an embeddable or stand-alone c/c++ interpreter. There is a Dr. Dobbs article with examples of embedding it here:
http://www.drdobbs.com/architecture-and-design/212201774
I haven't done more than play with it but it seems to be a stable and mature product. It's commercial, closed-source, but the "standard" version is described as free for both personal and commercial use. However, looking at the license it seems that "commercial" may only include internal company use, not embedding in a product that is then sold or distributed. (I'm not a lawyer, so clearly one should check with SoftIntegration to be certain of the license terms.)
I am not sure that embedding a C or C++ compiler like Clang is a good idea in your case. Because the "script", that is the (C or C++) code fed (at runtime!) can be arbitrary so be able to crash the entire application. You usually don't want faulty user input to be able to crash your application.
Be sure to read What every C programmer should know about undefined behavior because it is relevant and applies to C++ also (including any "C++ script" used by your application). Notice that, unfortunately, a lot of UB don't crash processes (for example a buffer overflow could corrupt some completely unrelated data).
If you want to embed an interpreter, choose something designed for that purpose, like Guile or Lua, and be careful that errors in the script don't crash the entire application. See this answer for a more detailed discussion of interpreter embedding.