Python non-trivial C++ Extension - c++

I have fairly large C++ library with several sub-libraries that support it, and I need to turn the whole thing into a python extension. I'm using distutils because it needs to be cross-platform, but if there's a better tool I'm open to suggestions.
Is there a way to make distutils first compile the sub-libraries, and link them in when it creates an extension from the main library?

I do just this with a massive C++ library in our product. There are several tools out there that can help you automate the task of writing bindings: the most popular is SWIG, which has been around a while, is used in lots of projects, and generally works very well.
The biggest thing against SWIG (in my opinion) is that the C++ codebase of SWIG itself is really rather crufty to put it mildly. It was written before the STL and has it's own semi-dynamic type system which is just old and creaky now. This won't matter much unless you ever have to get stuck in and make some modifications to the core (I once tried to add doxygen -> docstring conversion) but if you ever do, good luck to you! People also say that SWIG generated code is not that efficient, which may be true but for me I've never found the SWIG calls themselves to be enough of a bottleneck to worry about it.
There are other tools you can use if SWIG doesn't float your boat: boost.python is also popular and could be a good option if you already use boost libraries in your C++ code. The downside is that it is heavy on compile times since it is pretty much all c++ template based.
Both these tools require you to do some work up-front in order to define what will be exposed and quite how it will be done. For SWIG you provide interface files which are like C++ headers but stripped down, and with some extra directives to tell SWIG how to translate complex types etc. Writing these interfaces can be tedious, so you may want to look at something like pygccxml to help you auto-generate them for you.
The author of that package actually wrote another extension which you might like: py++. This package does two things: it can autogenerate binding definitions that can then be fed to boost.python to generate python bindings: basically it is the full solution for most people. You might want to start there if you no particulrly special or difficult requirements to meet.
Some other questions that might prove useful as a reference:
Extending python - to swig or not to swig
SWIG vs CTypes
Extending Python with C/C++
You may also find this comparison of binding generation tools for Python handy. As Alex points out in the comments though, its rather old now but at least gives you some idea of the landscape...
In terms of how to drive the build, you may want to look at a more advanced built tool that distutils: if you want to stick with Python I would highly recommend Waf as a framework (others will tell you SCons is the way to go, but believe me it's slow as hell: I've been there and back already!)...it takes a little learning, but when you get your head around it is extremely powerful. And since it's pure Python it will integrate perfectly with any other Python code you have as part of your build process (say for example you use Py++ in the end)...

Related

Is there a C-like syntax scripting language interpreter for C++?

I've started long ago to work on a dynamic graph visualizer, editor and algorithm testing platform (graphs with nodes and arcs, not the other kinds).
For the algorithm testing platform i need to let the user write a script or call a script from a file, which will interact with the graph currently loaded. The visualizer would do things like light up nodes while they're being visited by the script algorithm, adding some artificial delay, in order to visualize the algorithm navigating and doing stuff.
Scripts would also be secondly used to add third party features that i could either make available as pre-existing scripts in the program folder OR just integrate inside the program in c++ once they're tested and working.
All my searches for an interpreter to embed in my program sent me to lua;
then i started handwriting my own recursive descent parser for my own C-like syntax scripting language (which i planned to use a subset of C++ grammar so that any code written in my scripting language can be copy-pasted in any C++ code.
It was an interesting crazy idea which i don't regret at all, I have scopes, functions, cycles, gotos, typesafe variables, expressions.
But now that i'm approaching the addition of classes, class methods, inheritance (some default classes would be necessary to interface scripts to the program), i realized it's going to take A LOT of time and effort. A bit too much for a personal project of an ungraduated student with exams to study for… but still i whish to complete this project.
The self-imposed requirement of the scripts being 100% compatible with C++ was all but necessary, it would have been just a little nice extra thing, which i can do without.
Now the question is, is there an alternative to lua with a c-like syntax that supports all i've already done plus classes and inheritance? (being able to add custom "classes" that interface scripts to the program is mandatory)
(i can't assume the user to have a full c++ compiler installed so i cant just compile their "script" at runtime as a dll to load and call it, although i whish i could)
Just-in-time compilation of C++
Parsing C++ is hard. Heck, parsing C is hard. It's difficult to get it right, and there are a lot of edge cases. Thankfully, there are a few libraries out there which can take code and even compile it for you.
libclang
libclang provides a lot of facilities for parsing c++. It's a good, clean library, and it'll parse anything the clang compiler itself will parse. This article here is a good starter
libclang provides a JIT compilation tool that allows you to write and compile C++ at runtime. See this blog post here for a overview of what it does and how to use it. It's very general, very powerful, and user-written code should be fast.
GCC also provides a library called libgccjit for just-in-time compilation during the runtime of a program. libgccjit is a C library, but there's also a C++ wrapper provided by the library maintainers. It can compile abstract syntax trees and link them at runtime, although it's still in Alpha mode.
cppast
If you don't want to use libclang, there's also a library under development called cppast, which is a C++ parser which will give you an abstract syntax tree representation of your c++ code. Unfortunately, it won't parse function bodies.
Other tools
If anyone knows any other libraries for compiling or interpreting C++ at runtime, I encourage them to update this post, or comment them so I can update it!
Here is something that lets you embed a C-like scripting language in your application (and a bunch of other cool things):
http://chaiscript.com/
There is lots of documentation:
https://codedocs.xyz/ChaiScript/ChaiScript/

C++-based make system

Are there any build systems that don't use a DSL but actually use C++ as the build language?
Yo Dawg, I heard you like C++, so I added C++ to your build system, so you have to compile before you compile.
I've written a build system that I use in my projects in Python called pybake. It's designed to be a bit smarter than make, with less magic. The build is also defined in Python, thereby reusing an existing language, rather than generating a new DSL for that purpose. Here's an example of it in use.
None that are popular, if anyone was crazy enough to even write one. C++ would be an incredibly clumsy language for that.
If you're looking to create one, instead pick a language such as Python or Lua in order to use something popular and not invent a new DSL.
Are you asking if there are any build systems, like Make or Ant that use C++ code as the directives rather than specialized commands? While many higher level languages have such a system, there aren't any in C++ that I am aware of. Certainly not the popular ones. This is probably because C++ is a compiled language and not one that is trivial to parse. This makes it less suitable for what is essentially a lightweight scripting task.
There is probably C++ code in there somewhere but if you mean you would have to write a C++ program and compile it then run it to build a different source tree, I don't think that would really work in the scheme of things. What would you build your build script with? It goes on and on.
Compilers and the commands behind the scripting languages are often written in C or C++.
Since C++ is compiled, this would give you the problem of needing a build system for your C++-based build system. I am not aware of any C++-based build system and would be surprised to find one. For interpreted languages like Python, bootstrapping isn't an issue (and so you will find scons, for example).
If you are looking for something better than Make, though, check out CMake. Though somewhat out-of-date, you may find the C++ Project Template helpful as an example for creating CMake-based projects.

Is there a library that can compile C++ or C

I came here to ask this question because this site has been very useful to me in the past, seems to have very knowledgeable users who are willing to discuss a question even if it is metaphysical at times. And also because googling it did not work.
Java has a compiler and then it has a JDT library that can compile java on the fly (for example used in JasperReports to turn a report template into Java code).
My question: Does anyone know of a library/project that would offer compiling as a set of library classes in c/c++. For example: a suite of classes to perform Preprocessing, Parsing, CodeOptimization and of course Binary rendering to executable images such as ELF or Win format. Basically something that would allow one to compile c or c++ scriptlets as part of an application.
Yes: llvm. In particular, clang. At least, that's how they advertise the projects. Also, check this question. It might be relevant if you decide to use llvm.
You might be able to adapt something from the LLVM project to your needs.
You can just require that a compiler be installed, then call it. This is a fairly hefty requirement, but about the only way to truly "embed" C or C++. There are interpreters that you may be able to embed, but that currently seems a poor choice, not the least because any libraries used in the script must have development versions (i.e. headers and source/compiled-libraries) installed, and those libraries could be restricted to the feature set supported by the interpreter (e.g. quality of template implementation).
You're better off using a language like Python or Lua to embed.
There is the ch interpreter but I have not used it. Generally for scripting type applications a more natural scripted language is used.
Great. It looks like LLVM is what I was after.
Thanks a lot for your feedback.
I am not primarily after C++ as a scripting language. I have noticed that Python is used as an embedded script engine.
My primary reason is two fold:
Get rid off Make,CMake and the hell that is Autoconf and replace it with something like Scons that binds into and interacts with all phases of compiling
Hook into the compiling process after parsing and auto generate code. Specificaly meta related code. In my case I have been able to implement almost every feature of Java in C++ except one: Reflection.
Why impose on your code uneeded bload like RTTI that is often inadequate. Instead one could selectively generate added features. But developer would have to choice when and how to use this extra code.

Integrate Python And C++

I'm learning C++ because it's a very flexible language. But for internet things like Twitter, Facebook, Delicious and others, Python seems a much better solution.
Is it possible to integrate C++ and Python in the same project?
Interfacing Python with C/C++ is not an easy task.
Here I copy/paste a previous answer on a previous question for the different methods to write a python extension. Featuring Boost.Python, SWIG, Pybindgen...
You can write an extension yourself in C or C++ with the Python C-API.
In a word: don't do that except for learning how to do it. It's very difficult to do it correctly. You will have to increment and decrement references by hand and write a lot of code just to expose one function, with very few benefits.
Swig:
pro: you can generate bindings for many scripting languages.
cons: I don't like the way the parser works. I don't know if they've made some progress but two years ago the C++ parser was quite limited. Most of the time I had to copy/paste my .h headers to add some % characters and to give extra hints to the swig parser.
I also needed to deal with the Python C-API from time to time for (not so) complicated type conversions.
I'm not using it anymore.
Boost.Python:
pro:
It's a very complete library. It allows you to do almost everything that is possible with the C-API, but in C++. I never had to write a C-API code with this library. I also never encountered a bug due to the library. Code for bindings either works like a charm or refuses to compile.
It's probably one of the best solutions currently available if you already have some C++ library to bind. But if you only have a small C function to rewrite, I would probably try with Cython.
cons: if you don't have a precompiled Boost.Python library you're going to use Bjam (sort of a replacement of make). I really hate Bjam and its syntax.
Python libraries created with B.P tend to become obese. It also takes a lot of time to compile them.
Py++: it's Boost.Python made easy. Py++ uses a C++ parser to read your code and then generates Boost.Python code automatically. You also have a great support from its author (no it's not me ;-) ).
cons: only the problems due to Boost.Python itself.
Edit this project looks discontinued. While probably still working it may be better to consider switching.
Pybindgen:
It generates the code dealing with the C-API. You can either describe functions and classes in a Python file, or let Pybindgen read your headers and generate bindings automatically (for this it uses pygccxml, a python library wrote by the author of Py++).
cons: it's a young project, with a smaller team than Boost.Python. There are still some limitations: you cannot expose your own C++ exceptions, you cannot use multiple inheritance for your C++ classes.
Anyway it's worth trying!
Pyrex and Cython:
Here you don't write real C/C++ but a mix between Python and C. This intermediate code will generate a regular Python module.
Edit Jul 22 2013: Now Py++ looks discontinued, I'm now looking for a good alternative. I'm currently experimenting with Cython for my C++ library. This language is a mix between Python and C. Within a Cython function you can use either Python or C/C++ entities (functions, variables, objects, ...).
Cython is quite easy to learn, has very good performance, and you can even avoid C/C++ completely if you don't have to interface legacy C++ libraries.
However for C++ it comes with some problems. It is less "automagic" than Py++ was, so it's probably better for stable C++ API (which is now the case of my library). The biggest problem I see with Cython is with C++ polymorphism. With Py++/boost:python I was able to define a virtual method in C++, override it in Python, and have the Python version called within C++. With Cython it's still doable but you need to explicitly use the C-Python API.
Edit 2017-10-06:
There is a new one, pybind11, similar to Boost.Python but with some potential advantages. For example it uses C++11 language features to make it simpler to create new bindings. Also it is a header-only library, so there is nothing to compile before using it, and no library to link.
I played with it a little bit and it was indeed quite simple and pleasant to use. My only fear is that like Boot.Python it could lead to long compilation time and large libraries. I haven't done any benchmark yet.
Yes, it is possible, encouraged and documented. I have done it myself and found it to be very easy.
Python/C API Reference Manual - the API used by C and C++ programmers who want to write extension modules or embed Python.
Extending and Embedding the Python Interpreter
describes how to write modules in C or C++ to extend the Python interpreter with new modules. Those modules can define new functions but also new object types and their methods. The document also describes how to embed the Python interpreter in another application, for use as an extension language. Finally, it shows how to compile and link extension modules so that they can be loaded dynamically (at run time) into the interpreter, if the underlying operating system supports this feature.
Try Pyrex. Makes writing C++ extensions for Python easier.
We use swig very successfully in our product.
Basically swig takes your C++ code and generates a python wrapper around it.
I'd recommend looking at how PyTorch does their integration.
See this:
Extending Python with C or C++
"It is quite easy to add new built-in modules to Python, if you know how to program in C. Such extension modules can do two things that can't be done directly in Python: they can implement new built-in object types, and they can call C library functions and system calls.
To support extensions, the Python API (Application Programmers Interface) defines a set of functions, macros and variables that provide access to most aspects of the Python run-time system. The Python API is incorporated in a C source file by including the header "Python.h". "
http://www.python.org/doc/2.5.2/ext/intro.html
PS It's spelt "integrate" :)
I've used PyCxx http://cxx.sourceforge.net/ in the past and i found that it was very good.
It wraps the python c API in a very elegant manner and makes it very simple to use.
It is very easy to write python extension in c++. It is provided with clear examples so it is easy to get started.
I've really enjoyed using this library and I do recommend it.
It depends on your portability requirements. I've been struggling with this for a while, and I ended up wrapping my C++ using the python API directly because I need something that works on systems where the admin has only hacked together a mostly-working gcc and python installation.
In theory Boost.Python should be a very good option, since Boost is available (almost) everywhere. Unfortunately, if you end up on a OS with an older default python installation (our collaboration is stuck with 2.4), you'll run into problems if you try to run Boost.Python with a newer version (we all use Python 2.6). Since your admin likely didn't bother to install a version of Boost corresponding to every python version, you'll have to build it yourself.
So if you don't mind possibly requiring some Boost setup on every system your code runs on, use Boost.Python. If you want code that will definitely work on any system with Python and a C++ compiler, use the Python API.
Another interesting way to do is python code generation by running python itself to parse c++ header files. OpenCV team successfully took this approach and now they have done exact same thing to make java wrapper for OpenCV library. I found this created cleaner Python API without limitation caused by a certain library.
You can write Python extensions in C++. Basically Python itself is written in C and you can use that to call into your C code. You have full access to your Python objects. Also check out Boost.Python.

Exposing a C++ API to Python

I'm currently working on a project were I had to wrap the C++ classes with Python to be able to script the program. So my specific experience also involved embedding the Python interpreter in our program.
The alternatives I tried were:
Boost.Python
I liked the cleaner API produced by Boost.Python, but the fact that it would have required that users install an additional dependency made us switch to SWIG.
SWIG
SWIG's main advantage for us was that it doesn't require end users to install it to use the final program.
What have you used to do this, and what has been your experience with it?
I've used both (for the same project): Boost is better integrated with the STL, and especially C++ exceptions. Also, its memory management mechanism (which tries to bridge C++ memory management and Python GC) is way more flexible than SWIG's. However, SWIG has much better documentation, no external dependencies, and if you get the library wrapped in SWIG for Python you're more than half-way there to getting a Java/Perl/Ruby wrapper as well.
I don't think there's a clear-cut choice: for smaller projects, I'd go with Boost.Python again, for larger long-lived projects, the extra investment in SWIG is worth it.
EDIT - the Robin project is sadly abandoned, and won't be much use today
I've used Robin with great success.
Great integration with C++ types, and creates a single .cpp file to compile and include in your shared object.
I suggest SIP. SIP is better than SWIG due to the following reasons:
For a given set of files, swig generates more duplicate (overhead) code than SIP. SIP manages to generate less duplicate (overhead) code by using a library file which can be statically or dynamically linked. In other words SIP has better scalability.
Execution time of SIP is much less than that of SWIG. Refer Python Wrapper Tools: A Performance Study. Unfortunately link appears broken. I have a personal copy which can be shared on request.
pyrex or cython are also good and easy ways for mixing the two worlds.
Wrapping C++ using these tools is a bit trickier then wrapping C but it can be done. Here is the wiki page about it.
A big plus for Boost::Python is that it allows for tab completion in the ipython shell: You import a C++ class, exposed by Boost directly, or you subclass it, and from then on, it really behaves like a pure Python class.
The downside: It takes so long to install and use Boost that all the Tab-completion time-saving won't ever amortize ;-(
So I prefer Swig: No bells and whistles, but works reliably after a short introductory example.