Anyone with experience on embedding CINT into a C++ app? - c++

I'm talking about ROOT's CINT.
I've been developing a game in C++ which uses Python for programming the AI. As much as I love Python, and how easy it makes programming the AI (generators and FP are really sexy), it makes non-trivial algorithms run very slowly.
Then I remembered reading somewhere about CINT, and that it can be embedded. Now I need your help to decide whether to implement CINT as an alternative scripting system. With Python I use Boost.Python, which makes it almost painless to expose classes and objects once you get used to it. Is it as easy with CINT?

I've written classes compiled against ROOT, and then accessed them directly in the interpreter. That's easy, though all such classes are expected to derive from TObject. What I don't know is whether that is a CINT requirement or a ROOT requirement. You might be best off asking on the RootTalk CINT Support forum.
To address the questions in the comments:
The derivation from TObject can be second hand: your class can derive from something that itself derives from TObject; it just has to be a TObject.
ROOT provides a tool (makecint) and some macros (ClassDef and ClassImp) to support integrating your code with the interpreted execution environment: write your class, deriving it from TObject; include the ClassDef macro in the header and the ClassImp macro in the source file; run makecint over the code to generate all the tedious integration nonsense; then compile your code and the generated code to a shared object (or, I presume, a DLL on a Windows box); start the interpreter; load the library with .L; and your class is fully integrated with the interpreted environment (tab completion will work and all that). The build can be automated with make (and presumably other tools). Again, I don't know how much of this belongs to ROOT and how much to CINT. But it is all open source, so you can snag and adapt what you need.

Related

What are the FFI extensions?

I'm coding something in C++ and I would like to offer a scripting layer within my application. While searching for a workable solution I found these FFI extensions, but I can't find proper documentation for them, the person who invented them, or any other "reliable" technical source. The only thing that is clear to me is that this technology is cross-language: people talk about it for LuaJIT, Ruby and Haskell, but I have no clue what this "thing" is.
Is it something comparable to SWIG? Is it the new kid on the block?
FFI is a concept. It's what many languages call the glue layer that enables you to call into other languages (very often this bridge is to a C ABI), and it's thus different for each project (e.g. this for Erlang).
libffi is often used to implement that glue layer in the language, as is SWIG.
Excuse me for a very generic answer, but your question also seems generic.
In contrast to Java, there is no "one FFI" and there's no "one reliable scripting environment". In Java you have the JS Rhino scripting language almost 'built in' into the JRE.
C/C++ have the power of "relatively easily" interfacing with all/any of other platforms. This is because all other platforms are usually implemented in C/C++. Sorry for generalizing and oversimplifying. But, the point is, that IF you had the other-platforms' C++ sourcecodes, you'd add new things to them, just by observing how the standard ones were created.
The point is, that usually all other platforms are either:
closed source
open source, but licensed in a way that prevents you from using it
open source, well-licensed, but LARGE/COMPLICATED enough that you simply don't want to compile/change their internals
Therefore, every other platform produces its own way of calling C APIs. That way you can, e.g., use the Win32 API from Ruby/Python without recompiling the whole Python runtime just because you wanted to call SendKeys etc.
So, usually, it's the "other platforms" that call into C/C++ rather than the other way around. It's because they all already have a way to do that. Even Java has JNI and .NET has P/Invoke, right?
Calling C++ from X is possible, but how it is done differs very much depending on the platform (X, Y or Z). And very often it works but is tedious to define or just unwieldy. E.g. in Ruby on Windows you can call any C API by:
giving the actual name of the function i.e. _##2AV21moveMeByDistance
giving encoded list of parameter types i.e. ifpp (int,float,pointer,pointer)
giving the proper parameters in proper types and order
If you don't follow these rules, it will crash badly. The above examples are artificial and incorrect, by the way, but the real thing looks similar.
So, on almost any platform, there have been many tools to simplify that; each platform has different ones. But all are called FFI, like nos said, because it is a concept.
The tools may, e.g., require you to annotate the C++ code with some extra information like:
class RUBIZE MyClass :: BaseThing
{
PUBLIC(void,int,int) void moveMeByDistance(int x, int y) { ... }
PUBLIC(void) void .......
};
and then they can process your .h files to generate the "bindings" automatically, producing a "binding library" that adjusts the scripting environment so that you can directly call MyClass::moveMeByDistance(5,6) from script.
Or they can require you to register all the bits at runtime, e.g.:
class MyClass :: ScriptObject
{
void moveMeByDistance(int x, int y) { ... }
void .......
};
int dllmain()
{
ScriptingEnv::RegisterClass<MyClass>();
ScriptingEnv::RegisterClass( "move", &MyClass::moveMeByDistance );
ScriptingEnv::RegisterClass( "eat", .... );
}
but again, all of the above examples are artificial.
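To make the runtime-registration idea a bit more concrete, here is a minimal self-contained sketch. It is not any real binding library: ScriptingEnv, registerMethod, call and makeEnv are hypothetical names, and the "script" side is simulated by looking a method up by its string name, which is roughly what an interpreter does when it dispatches a call.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical application class we want to expose to scripts.
class MyClass {
public:
    void moveMeByDistance(int x, int y) { x_ += x; y_ += y; }
    int x() const { return x_; }
    int y() const { return y_; }
private:
    int x_ = 0;
    int y_ = 0;
};

// Hypothetical registry: maps a script-visible name to a C++ callable.
class ScriptingEnv {
public:
    using Method = std::function<void(MyClass&, int, int)>;

    void registerMethod(const std::string& name, Method m) {
        assert(m && "a registered method must be callable");
        methods_[name] = std::move(m);
    }

    // Roughly what an interpreter would do for the script call `obj.move(5, 6)`.
    void call(const std::string& name, MyClass& self, int a, int b) const {
        auto it = methods_.find(name);
        if (it == methods_.end())
            throw std::runtime_error("unknown script method: " + name);
        it->second(self, a, b);
    }

private:
    std::map<std::string, Method> methods_;
};

// Builds an environment with one method registered under the name "move".
ScriptingEnv makeEnv() {
    ScriptingEnv env;
    env.registerMethod("move", &MyClass::moveMeByDistance);
    return env;
}
```

Real binding libraries add argument conversion between script values and C++ types on top of this lookup; that part is what makes them tedious to write by hand.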
As C/C++ can interface to any other platform (thanks to the fact that any other platform usually has tools for that :P), you first have to decide which scripting language you would like to use. JS? Lua? Ruby? Python? All of them can be bridged. Then look for libraries that provide you with the most comfy way of bridging. There are many, but you will have to look at sites/forums/etc. related to that very language. Or even here, but then ask about a concrete language/library.
Keywords to look for are:
modules --- i.e. "extending XYZ Script with new modules"
extensions --- i.e. "writing extensions for Ruby in C++"
bindings --- i.e. "Ruby bindings for FastFourierTransform Library"
calling C/C++ code/classes from XYZ language
etc.

Writing C++ "Scripts"

I am a solo developer on a large C++ library that I use for research (I'm a PhD student). Let's say the library has a bunch of classes that implement cool algorithms: Algorithm1, Algorithm2, etc. I then write a bunch of C-style functions that are stand-alone "scripts" that use the library to either test the recently added functionality or to run simulations that produce plots that I then include in wonderfully-brilliant (I'm in denial) journal publications. The design of the library follows good software engineering principles (to the best of my knowledge and ability), but the "scripts" that link the library from main.cpp do not follow any principle except: "get the job done".
I now have over 300 such "scripts" in a single file (20,000+ lines of code). I have no problem with it, I remain very productive, and that's really the ultimate goal. But I wonder if this approach has major weaknesses that I just have learned to live with.
// File: main.cpp
#include <cool_library/algorithm1.h>
#include <cool_library/algorithm2.h>
...
#include <cool_library/algorithmn.h>
void script1() {
// do stuff that uses some of the cool library's algorithms and data structures
// but none of the other scriptX() functions
}
void script2() {
// do stuff that uses some of the included algorithms and data structures
}
...
// Main function where I comment in the *one* script I want to run.
int main() {
// script1();
// script2();
// script3();
...
script271();
return 0;
}
Edit 1: There are several goals that I have in this process:
Minimize the time it takes to start a new script function.
Make all old script functions available at my finger tips for search. So I can then copy and paste bits of those scripts into a new one. Remember this is NOT supposed to be good design for use by others.
I don't care about the compilation time of the script file because it compiles in under a second as it is now with the 20,000 lines of code.
I use Emacs as my "IDE" by the way, in Linux, using the Autoconf/Automake/Libtool process for building the library and the scripts.
Edit 2: Based on the suggestions, I'm starting to wonder if part of the way to increase productivity in this scenario is not to restructure the code, but to customize/extend the functionality of the IDE (Emacs in my case).
If I were you, I would split that huge file into 300 smaller ones: each would have just one scriptNN() and main() calling just it.
Now, when you have it compiled, you will have 300 small scriptNN executables (you may need to create appropriate Makefile for this though).
What's nice about this - now you can use these script executables as building blocks to be put or called by other scripts, like bash, python, perl, etc.
EDIT: An explanation of how this design addresses your goals.
Time to start new script function - simply copy one of existing files and tweak it a little.
Make all old script functions available at my finger tips for search - emacs can do multi-file search across all other script files you have.
I don't care about the compilation time of the script file - it does not matter then. But you will have all of them available to you at once, without editing one big main() and recompiling.
Your example may be a good use case for a scripting language. To be more specific, you could have all your script* C++ functions glued to some interpreter, like Lua, Python, OCaml, Guile etc., and have your test cases written in the scripting language.
All scripting languages enable you to glue your C (hence also C++) functions.
For Lua, see its Lua API chapter. For Python, see its Extending & Embedding Python section. For OCaml, see the Interfacing C with OCaml section. For Guile, see the Programming in C chapter.
You may wish to embed the interpreter inside your main function, or you could extend the existing interpreter with your new C++ functions (hence using some main provided by the interpreter).
Notice that using some scripting language may have a profound impact on the design and architecture of your library and software.
If you are comfortable with it, and it works for you, just stick with it. You said you are the only developer, then just do whatever you want. I always spend too much time thinking about things like this for my projects :P. I've learned to just focus on the important and productive things. Theoretical things only work in theory...
All the suggested answers are good and you can even combine them. Just to add my 5 cents: your execution flow fits exactly into Strategy and Command design patterns. You may want to look at their benefits, but it's a question of benefit vs. investment.
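As a rough illustration of the Command-style idea, the sketch below keeps all scripts in one file but selects the one to run by name instead of by commenting lines in and out of main(). All names here (scriptTable, addScript, runScript, registerScripts, lastRun) are made up for the example, and the script bodies are placeholders.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical registry mapping a script name to the function that runs it.
using Script = std::function<void()>;

std::map<std::string, Script>& scriptTable() {
    static std::map<std::string, Script> table;
    return table;
}

void addScript(const std::string& name, Script fn) {
    assert(fn && "a script must be callable");
    scriptTable()[name] = std::move(fn);
}

// Runs the named script; returns false if no such script is registered.
bool runScript(const std::string& name) {
    auto it = scriptTable().find(name);
    if (it == scriptTable().end()) return false;
    it->second();
    return true;
}

// Records which script ran last, only so the effect is observable in this sketch.
std::string& lastRun() {
    static std::string name;
    return name;
}

// Placeholder script bodies; the real ones would use the library.
void script1() { lastRun() = "script1"; }
void script2() { lastRun() = "script2"; }

void registerScripts() {
    addScript("script1", script1);
    addScript("script2", script2);
}
```

With this in place, main() could call registerScripts() once and then runScript(argv[1]), so picking a script becomes a command-line argument rather than an edit and recompile.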

C++ Introspection: Enumerate available classes and methods in a C++ codebase

I'm working on some custom C++ static code analysis for my PHD thesis. As part of an extension to the C++ type system, I want to take a C++ code base and enumerate its available functions, methods, and classes, along with their type signatures, with minimal effort (it's just a prototype). What's the best approach to doing something like this quickly and easily? Should I be hacking on Clang to spit out the information I need? Should I look at parsing header files with something like SWIG? Or is there an even easier thing I could be doing?
GCCXML, based on GCC, might be the ticket.
As I understand it, it collects and dumps all definitions but not the content of functions/methods.
Others will likely mention CLANG, which certainly parses code and must have access to the definitions of the symbols in a compilation unit. (I have no experience here).
For completeness, you should know about our DMS Software Reengineering Toolkit with its C++ Front End. (The CLANG answers seem to say "walk the AST"). The DMS solution provides an enumerable symbol table containing all the type information. You can walk the AST, too, if you want.
Often a static analysis leads to a diagnosis, and a desire to change the source code. DMS can apply source-to-source program transformations to carry out such changes conditioned by the analysis.
I heartily recommend LLVM for static analysis (see also Clang Static Analyzer).
I think your best bet is hacking on Clang and getting the AST. There is a good tutorial for that here. It's very easy to modify its syntax and it also has a static analyzer.
At my work, I use the API from a software package called "Understand 4 C++" by scitools. I use this to write all my static analysis tools. I even wrote a .NET API to wrap their C API. Which I put on codeplex.
Once you have that, dumping all class types is easy:
ClassType[] allclasses = Database.GetAllClassTypes();
foreach (ClassType c in allclasses)
{
Console.WriteLine("Class Name: {0}", c.NameLong);
}
Now for a little backstory about a task I had that is similar to yours.
In some years we have to keep our SDK binary backwards compatible with the previous year's SDK. In that case it's useful to compare the SDK code between releases to check for potential breaking changes. However, with a couple of hundred files and tens of thousands of lines of comments, that can be a big headache using a text diff tool like Beyond Compare or Araxis. So what I really need to look at is actual code changes: not re-ordering, not moving code up and down in the file, not adding comments etc.
So I wrote a tool to dump out all the code.
In one text file I dump all the classes. For each class I print its inheritance tree and its member functions, both virtual and non-virtual. For each virtual function I print which parent class virtual methods it overrides (if any). I also print out its member variables.
Same goes with the structs.
In another file I print all the macros.
In another file I print all the typedefs.
Then I can diff these files with the files from a previous release. It becomes apparent instantly what has changed from release to release. For instance, it's easy to see where a function parameter was changed from TCHAR* to const TCHAR*.
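The comparison step can also be done in code rather than with a text diff tool. The following is only a sketch of that idea, not the tool described above: it assumes each release's dump has already been reduced to a set of signature strings, and ApiDiff and diffApi are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Signatures present in only one of the two releases.
struct ApiDiff {
    std::vector<std::string> removed; // in the old release only
    std::vector<std::string> added;   // in the new release only
};

// std::set keeps its elements sorted, which std::set_difference requires.
// Declaration order in the source files therefore does not affect the result.
ApiDiff diffApi(const std::set<std::string>& oldApi,
                const std::set<std::string>& newApi) {
    ApiDiff d;
    std::set_difference(oldApi.begin(), oldApi.end(),
                        newApi.begin(), newApi.end(),
                        std::back_inserter(d.removed));
    std::set_difference(newApi.begin(), newApi.end(),
                        oldApi.begin(), oldApi.end(),
                        std::back_inserter(d.added));
    return d;
}
```

A changed parameter type shows up as one removed signature plus one added signature, which is the TCHAR* to const TCHAR* case mentioned above.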
You might consider developing a GCC plugin for your purpose.
And GCC MELT is a high-level domain-specific language (that I designed and implemented) for easily extending GCC.
The paper at the GROW09 workshop by Peter Collingbourne and Paul Kelly, A Compile-Time Infrastructure for GCC Using Haskell, might be relevant for your work.

Script system in application

I'm developing a game and now I want to make a script system for it.
Right now I have an abstract class Object which is inherited by all game objects. I have to write a lot of technical code: add each new object type to an enum and register a parser function for each object (that function parses the object's parameters from a file).
I don't want to do such work. So the idea is to use some script system (Boost.Python for example, because I'm already using Boost in my project). Each object will be a simple Python script, and on the C++ side I just load and run all those scripts.
Python isn't statically typed, so I can register functions, build types dynamically without storing an enum, etc. The only bad part is writing a lot of binding code, but that only has to be done once.
Are my ideas right?
Can you give us a rough idea of how large the game is going to be?
If you're not careful, you could give yourself a lot of extra work without much benefit, but with some planning it sounds like it might help. The important questions are "What parts of the program do I want to simplify?", "Do I need a scripting language to simplify them?" and "Can the scripting language simplify them?".
You mentioned that you don't want to have to manually parse files. Python's pickle module could handle serialization for you, but so could .NET. If you're using Visual Studio, then you may find it easier to write the code in C# than in Python.
You should also look for ways to simplify your code without adding a new language. For example, you might be able to create a simple binary file format and store your data structures without much parsing. There are probably other things you can do, but that would require more detailed knowledge of the program.
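One way to drop the enum without adding a new language is a factory keyed by type name. The sketch below is only an assumption about what the game's code might look like: Tree, Rock, ObjectFactory and the one-line "TypeName params" file format are invented for illustration.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Stand-in for the game's abstract base class.
class Object {
public:
    virtual ~Object() = default;
    virtual std::string describe() const = 0;
};

class Tree : public Object {
public:
    explicit Tree(int height) : height_(height) {}
    std::string describe() const override {
        return "Tree(" + std::to_string(height_) + ")";
    }
private:
    int height_;
};

class Rock : public Object {
public:
    explicit Rock(int weight) : weight_(weight) {}
    std::string describe() const override {
        return "Rock(" + std::to_string(weight_) + ")";
    }
private:
    int weight_;
};

// Maps a type name found in the data file to a function that parses its parameters.
class ObjectFactory {
public:
    using Parser = std::function<std::unique_ptr<Object>(std::istream&)>;

    void add(const std::string& type, Parser p) {
        assert(p && "a parser must be callable");
        parsers_[type] = std::move(p);
    }

    // Parses one line of the form "TypeName params..."; returns nullptr for unknown types.
    std::unique_ptr<Object> parseLine(const std::string& line) const {
        std::istringstream in(line);
        std::string type;
        in >> type;
        auto it = parsers_.find(type);
        if (it == parsers_.end()) return nullptr;
        return it->second(in);
    }

private:
    std::map<std::string, Parser> parsers_;
};

ObjectFactory makeFactory() {
    ObjectFactory f;
    f.add("Tree", [](std::istream& in) {
        int h = 0;
        in >> h;
        return std::unique_ptr<Object>(new Tree(h));
    });
    f.add("Rock", [](std::istream& in) {
        int w = 0;
        in >> w;
        return std::unique_ptr<Object>(new Rock(w));
    });
    return f;
}
```

Adding a new object type then means writing the class and one add() call; nothing else in the loader changes.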

Any tutorial for embedding Clang as script interpreter into C++ Code?

I have no experience with LLVM or Clang yet. From what I read, Clang is said to be easily embeddable (Wikipedia: Clang); however, I did not find any tutorials about how to achieve this. So is it possible to provide the user of a C++ application with scripting powers by JIT compiling and executing user-defined code at runtime? Would it be possible to call the application's own classes and methods and share objects?
edit: I'd prefer a C-like syntax for the script language (or even C++ itself)
I don't know of any tutorial, but there is an example C interpreter in the Clang source that might be helpful. You can find it here: http://llvm.org/viewvc/llvm-project/cfe/trunk/examples/clang-interpreter/
You probably won't have much of a choice of syntax for your scripting language if you go this route. Clang only parses C, C++, and Objective-C. If you want any variations, you may have your work cut out for you.
I think this is exactly what you described:
http://root.cern.ch/drupal/content/cling
You can use clang as a library to implement JIT compilation as stated by other answers.
Then, you have to load up the compiled module (say, an .so library).
In order to accomplish this, you can use standard dlopen (unix) or LoadLibrary (windows) to load it, then use dlsym (unix) to dynamically reference compiled functions, say a "script" main()-like function whose name is known. Note that for C++ you would have to use mangled symbols.
A portable alternative is e.g. GNU's libltdl.
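The dlopen/dlsym step can be sketched as follows (POSIX only). lookupSymbol and StrlenFn are names made up for this example. To keep it runnable without building a separate .so, the usage below passes nullptr as the library path, which asks for the symbols already visible to the running process; a real "script" module would pass the path of the compiled library and the name of its entry function instead. On older glibc versions you may need to link with -ldl.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <dlfcn.h> // POSIX: dlopen, dlsym, dlerror

// Signature we expect the looked-up function to have. Getting this wrong
// is undefined behaviour; nothing checks it for you.
using StrlenFn = std::size_t (*)(const char*);

// Looks up symbolName in the library at libraryPath.
// A null libraryPath searches the running program and the libraries it loaded at startup.
// Returns nullptr if the library or the symbol cannot be found.
void* lookupSymbol(const char* libraryPath, const char* symbolName) {
    assert(symbolName != nullptr);
    void* handle = dlopen(libraryPath, RTLD_NOW);
    if (!handle) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return nullptr;
    }
    void* sym = dlsym(handle, symbolName);
    if (!sym) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlsym failed: %s\n", err ? err : "unknown error");
    }
    // The handle is deliberately not closed: closing it could unload the
    // library and invalidate the returned address.
    return sym;
}
```

Usage would look like `auto fn = reinterpret_cast<StrlenFn>(lookupSymbol(nullptr, "strlen"));` followed by a null check before calling fn. C functions, or C++ functions declared extern "C", can be looked up by their plain names; other C++ functions need their mangled names, as noted above.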
As an alternative, the "script" may run automatically at load time by implementing module init functions or putting some static code (the constructor of a C++ globally defined object would be called immediately).
The loaded module can directly call anything in the main application. Of course symbols are known at compilation time by using the proper main app's header files.
If you want to easily add C++ "plugins" to your program, and know the component interface a priori (say your main application knows the name and interface of a loaded class from its .h before the module is loaded in memory), after you dynamically load the library the class is available to be used as if it was statically linked. Just be sure you do not try to instantiate a class' object before you dlopen() its module.
Static code also makes it possible to implement nice automatic plugin registration mechanisms.
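A minimal sketch of such automatic registration is shown below, with everything in one translation unit so it can be run as is. pluginRegistry, PluginRegistrar and doubleIt are hypothetical names. In a real setup the last two definitions would live in the dynamically loaded module, and the registrar's constructor would run when the module is loaded.

```cpp
#include <cassert>
#include <map>
#include <string>

// Entry-point signature that every plugin function is assumed to share.
using PluginEntry = int (*)(int);

// The registry is a function-local static so that it is constructed on first
// use, even if a registrar in another translation unit runs before main().
std::map<std::string, PluginEntry>& pluginRegistry() {
    static std::map<std::string, PluginEntry> registry;
    return registry;
}

// Constructing one of these at namespace scope registers the plugin as a side effect.
struct PluginRegistrar {
    PluginRegistrar(const char* name, PluginEntry entry) {
        assert(name != nullptr && entry != nullptr);
        pluginRegistry()[name] = entry;
    }
};

// "Plugin" side: the function and the static object that registers it.
static int doubleIt(int x) { return 2 * x; }
static PluginRegistrar registerDoubleIt("double", &doubleIt);
```

The main application only needs to know the registry and the entry-point signature; it can then find "double" by name without any explicit registration call.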
I don't know about Clang but you might want to look at Ch:
http://www.softintegration.com/
This is described as an embeddable or stand-alone c/c++ interpreter. There is a Dr. Dobbs article with examples of embedding it here:
http://www.drdobbs.com/architecture-and-design/212201774
I haven't done more than play with it but it seems to be a stable and mature product. It's commercial, closed-source, but the "standard" version is described as free for both personal and commercial use. However, looking at the license it seems that "commercial" may only include internal company use, not embedding in a product that is then sold or distributed. (I'm not a lawyer, so clearly one should check with SoftIntegration to be certain of the license terms.)
I am not sure that embedding a C or C++ compiler like Clang is a good idea in your case, because the "script", that is the C or C++ code fed in at runtime, can be arbitrary and so can crash the entire application. You usually don't want faulty user input to be able to crash your application.
Be sure to read What every C programmer should know about undefined behavior because it is relevant and applies to C++ also (including any "C++ script" used by your application). Notice that, unfortunately, a lot of UB doesn't crash the process (for example, a buffer overflow could corrupt some completely unrelated data).
If you want to embed an interpreter, choose something designed for that purpose, like Guile or Lua, and be careful that errors in the script don't crash the entire application. See this answer for a more detailed discussion of interpreter embedding.