What does JavaVMInitArgs.version really mean? - java-native-interface

I am trying to load a JVM from C++ via JNI, and I originally mindlessly set my JavaVMInitArgs.version to JNI_VERSION_1_6 without really knowing what it means.
Later, I installed Java 8, modified my makefile to link the new libjvm.so and include the new jni.h, and changed my version to JNI_VERSION_1_8; now my call to JNI_CreateJavaVM returns -3 (JNI_EVERSION).
I switched back to JNI_VERSION_1_6 and it loaded fine. I checked the version number from my JNIEnv object and saw that it still said 1.6. Just out of curiosity, I tried again with JNI_VERSION_1_4 and found that not only did it still load fine, but that the version still was 1.6.
So it appears that my executable is still pointing to the Java 1.6 version of libjvm.so for some reason, probably due to an issue with one of my makefiles. I'll debug that on my own.
The real question I have for the people of Stack Overflow is: what on earth does JavaVMInitArgs.version really even mean?
I'm going under the assumption that the JNI versions correspond to Java versions (so JNI_VERSION_1_8 is somehow related to JRE 8), but I'm not clear on exactly how that affects what gets loaded or how it gets used.
My thoughts are that perhaps the version you specify indicates the minimum version of java that you need in order to run your program, so if you specify JNI_VERSION_1_X then you can load any JVM compatible with Java Y as long as Y >= X?
Also, does the JNI version only dictate the version required for C++'s interaction with the Java code, or does it dictate the version of the Java code itself? In other words, let's say my Java code does some stuff that requires Java 7, but my C++ code is isolated from that and only calls Java 1.4-compatible stuff. Can I then set my JNI version to 1_4 and link my program with a Java 7 version of libjvm.so?
I realize that I kind of asked a lot of questions at once, but if anyone can give me a description of how this works, I would be most grateful. And of course if anyone has an idea of why I can't seem to load Java8, I'd love to hear your advice as well.
Edit
I figured out why I couldn't seem to properly link in the Java 8 version of libjvm.so. I had correctly added the new library to my -L g++ argument in my makefile, but my environment variable LD_LIBRARY_PATH (which the runtime loader checks first) still pointed to the old libjvm.so. I set my LD_LIBRARY_PATH to point to the correct library (I probably also could have just removed it) and now it works great.
I'm still interested to know what exactly the JNI_VERSION values mean though.

JavaVMInitArgs.version specifies the version of the JNI interface you expect. The JNI API may differ slightly from one version to another.
Example:
#ifdef JNI_VERSION_1_1
    JDK1_1InitArgs vm_args;
    ...
#else
    JavaVMInitArgs vm_args;
    ...
#endif /* JNI_VERSION_1_1 */
Source: The Java™ Native Interface: Programmer's Guide and Specification
You can get the version number supported by the library you are linking against via GetVersion: https://docs.oracle.com/javase/9/docs/specs/jni/functions.html#getversion
You can also look for JNI enhancements in the release notes for a given JDK release:
https://www.oracle.com/technetwork/java/javase/13all-relnotes-5461743.html

Related

Converting LLVM IR to Java Bytecode

I am a beginner and want to build a translator that can convert LLVM bitcode to Java bytecode.
Can somebody please tell me briefly, or list some major steps for, how to go about it?
In our company (Altimesh), we did the same thing for CIL. For Java Bytecode, the task is likely very similar.
I can tell you it's quite a long task.
First thing: the LLVM libraries are written in C++.
That means you either have to learn C++ and find a way to generate Java bytecode from C++, or export the symbols you need from the LLVM libraries through JNI. I strongly recommend the second option, as you'll get a pure Java implementation (and you'll soon figure out that you don't need that many symbols from the LLVM API).
Once you've figured that out, you need to:
Parse modules from files
Here is a simple example (using the LLVM 3.9 API, which is quite old now):
llvm::Module* llvm__Module_buildFromFile(llvm::LLVMContext* context, const char* filename)
{
    llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> buf =
        llvm::MemoryBuffer::getFile(filename);
    if (!buf)
        return nullptr; // the file could not be read
    llvm::SMDiagnostic diag;
    // On parse failure this returns null; diag then holds the error details.
    return llvm::parseIR(buf->get()->getMemBufferRef(), diag, *context).release();
}
Parse debug infos
void llvm__DebugInfoFinder__processModule(llvm::DebugInfoFinder* self, llvm::Module* M)
{
    self->processModule(*M);
}
Debug info, or metadata, is quite a pain with LLVM, as it changes very frequently (compared to instructions). So you either have to stick to one LLVM version (probably a bad choice), or update your code as soon as a new LLVM release comes out.
Once you're there, most of the pain is behind you, and you enter the world of fun.
I strongly recommend to start with something very very simple, such as a simple addition program.
Then always keep two windows opened, godbolt showing you input llvm you need to parse, and a java window showing you the target (here is an example for MSIL).
Once you're able to transpile your first program (hurrraah, I can add two integers :) ), you will soon want to transpile more stuff, and soon you will face two insanities:
getelementptr. This is how arrays, memory, structures and so on are accessed in LLVM. It is a pretty magic instruction.
phi. A crucial instruction in the LLVM system, as it enables Static Single Assignment, which is fairly important for the backend (register allocator and co). I don't know about Java, but this was obviously not available in MSIL.
Once all of that is done, you enter the endless world of pain of special cases, weird C constructs you didn't know about, gcc extensions and so on...
Anyway good luck!

How to give someone a C/C++ program without giving them the source

I want to give a C++ program to someone for testing, but I don't want them to see the source just yet. My main issues are that I don't know what platform that person is using, and I don't want to create a shared library unless I have no other option. Ideally, I would like to send headers and object files for the person to compile and link him/herself, but as far as I know that would only work if the person has the same setup that I have.
I am currently using Windows, but I'm comfortable working on any Unix-like system as well, and I am not using an IDE, in case you need that information.
Well, on Windows you can build against the native Win32 API, which is strongly backward compatible. Distributing correctly set-up binary .dll files, along with consistent headers, is enough.
For Linux the scenario is different: you either distribute a package compiled from source (which discloses it), or distribute binaries for every Linux distribution you actually want to support.
If you want to avoid source disclosure where compilation on specific target systems is needed, use a licensing mechanism that prevents unauthorized runs.
Assuming the choice of machine is "reasonable" - in other words, it's something running Linux, Windows, Android or MacOS and a reasonable target processor such as MIPS, Sparc, x86 or ARM, then one POSSIBLE solution is to use clang -S -emit-llvm yourfile.cpp to produce an intermediate form of the LLVM "virtual machine code". This can then, using llc, be compiled to machine code for any target that LLVM supports.
It's not completely impossible to figure out roughly what the source code looked like, but unless someone wants to put a LOT of effort into running your code, they won't be able to see what the code does. And even giving someone a binary allows them, if they are that way inclined, to reverse engineer the code.
The other alternative, as I see it, is that you demonstrate the code on your machine [or a machine under your control].
There are also tools that can "obfuscate" source-code (rename variables, structure/class members and functions to a, b, c; remove any comments; and "unformat" the code - all of which makes it much harder to understand what the code does). Sorry, you'll have to google to find a good one, as I have never used such a thing myself. And again, of course, it's not IMPOSSIBLE to recover the code into something that can be used and modified and rebuilt. There is really no way to avoid giving the customer something they can compile unless you know what OS/processor it is for.

Embed C++ compiler in application

Aren't shaders cool? You can toss in a plain string and, as long as it is valid source, it will compile, link, and execute. I was wondering if there is a way to embed GCC inside a user application so that it is "self-sufficient", i.e. has the internal capability to compile native binaries compatible with itself.
So far I've been invoking standalone GCC from a process started inside the application, but I was wondering if there is some API or something that would allow it to be used "directly" rather than as a standalone compiler. Also, in case it is possible, is it permitted?
EDIT: Although the original question was about GCC, I'd settle for information on how to embed LLVM/Clang too.
And now a special edit for people who cannot put 2 + 2 together: the question asks how to embed GCC or Clang inside an executable in a way that allows an internal API to be used from code, rather than invoking compilation from a command prompt.
I'd add +1 to the suggestion to use Clang/LLVM instead of GCC. A few good reasons why:
it is more modular and flexible
compilation time can be substantially lower than GCC
it supports the platforms you listed in the comments
it has an API that can be used internally
// NOTE: this snippet targets an older Clang driver API (circa LLVM 3.2);
// llvm::sys::Path and clang::OwningPtr have been removed in later releases.
std::string source = "app.c";
std::string target = "app";
llvm::sys::Path clangPath = llvm::sys::Program::FindProgramByName("clang");

// arguments
std::vector<const char*> args;
args.push_back(clangPath.c_str());
args.push_back(source.c_str());
args.push_back("-l");
args.push_back("curl");

clang::TextDiagnosticPrinter* DiagClient =
    new clang::TextDiagnosticPrinter(llvm::errs(), clang::DiagnosticOptions());
clang::IntrusiveRefCntPtr<clang::DiagnosticIDs> DiagID(new clang::DiagnosticIDs());
clang::DiagnosticsEngine Diags(DiagID, DiagClient);
clang::driver::Driver TheDriver(args[0], llvm::sys::getDefaultTargetTriple(),
                                target, true, Diags);
clang::OwningPtr<clang::driver::Compilation> c(TheDriver.BuildCompilation(args));

int res = 0;
const clang::driver::Command* FailingCommand = 0;
if (c) res = TheDriver.ExecuteCompilation(*c, FailingCommand);
if (res < 0) TheDriver.generateCompilationDiagnostics(*c, FailingCommand);
Yes, it is possible, for example, QEMU does it.
I don't have any personal experience in this field, but from what I've read, it seems that LLVM might be better suited for embedding and extending than GCC.
Some older list of C++ compilers and interpreters is available at http://www.thefreecountry.com/compilers/cpp.shtml.
The usual answer for a "self-sufficient" application is a good language interpreter. There are many of them out there, and many compile the code into byte-code files. The Lua language interpreter is very popular and easily embeddable; even some strong players use it.
There was also an open source C++ interpreter with great language compatibility produced years ago, starting with F.. I don't remember the rest of the name. There are also many other tools able to produce native binaries (e.g. Free Pascal).
The choice of language and target platform depends on your intentions. What would the "self-sufficiency" be good for? Who will write those libraries? Once you have that clear, use Google; there is a whole wildlife out there. One of the latest beasts is the open-sourced C# compiler, "Roslyn".
EDIT
If you need a C compiler that can be "embedded" (as you generate a C subset), you are probably looking for a "portable C compiler", in the sense that you can put it on a USB stick and carry it with you. Portable applications can easily be "embedded" into other applications and included in the installer.
The ability to "embed" the compiler as statically linked code inside the main application binary is probably not required.
Some reference to portable MinGW is described in this https://stackoverflow.com/questions/7617410/portable-c-compiler-ide SO question.
An open source C++ editor with integrated MinGW is here https://code.google.com/p/pocketcpp/.
I don't have anything more to say as I'd have to go and browse Google - so I will not win the bounty :)
Why not just call the compiler and linker from your application using fork()/exec() (for UNIX-like platforms)? Create a shared library that you can then load with dlopen().
This avoids possible licensing issues and gives you less of a maintenance burden.
This is, e.g., what Varnish does with its configuration files:
The VCL language is a small domain-specific language designed to be used to define request handling and document caching policies for Varnish Cache.
When a new configuration is loaded, the varnishd management process translates the VCL code to C and compiles it to a shared object which is then dynamically linked into the server process.

C++ compilation at runtime

So I watched this video months back where Epic showcased their new Unreal Engine 4 developer features. Sorry, I could not find the video, but I'll try my best to explain.
The one feature that got my attention was the "on the fly" C++ modification and compilation. The guy showed how he was playing the game in the editor, modified some variables in the code, saved it, and the changes were immediately mirrored in the game.
So I've been wondering: how does one achieve this? Currently I can think of two possible ways: either it's a hoax and it was only a "C-style" scripting language, not C++ itself, OR it's shared-library (read: DLL) magic.
Here's something I whipped up to try myself (simplified):
for (;;)
{
    // Move DLL from build directory to execution directory
    // Link to DLL using LoadLibrary() / dlopen() and fetch a pointer to function "foo()" from it
    for (;;)
    {
        // Call the function "foo()" found in the DLL
        // Check the source of the DLL for changes
        // If it has changed, attempt to compile
        // On successful compile, break
    }
    // Unload DLL with FreeLibrary() / dlclose()
}
Now that seems to work but I can't help but wonder what other methods are there?
And how would this kind of approach compare versus using a scripting language?
edit: https://www.youtube.com/watch?v=MOvfn1p92_8&t=10m0s
Yes, "hot code modification" is definitely a capability that many IDEs/debuggers have to one extent or another. Here's a good article:
http://www.technochakra.com/debugging-modifying-code-at-runtime/
Here's the man page for MSVS "Edit and Continue":
http://msdn.microsoft.com/en-us/library/esaeyddf%28v=vs.80%29.aspx
Epic's Hot Reload works by having the game code compiled and loaded as a dll/.so, and this is then dynamically loaded by the engine. Runtime reloading is then simply re-compiling the library and reloading, with state stored and restored if needed.
There are alternatives. You could take a look at Runtime Compiled C++ (or see RCC++ blog and videos), or perhaps try one of the other alternatives I've listed on the wiki.
As the author of Runtime Compiled C++ I think it has some advantages: it's far faster to recompile since it only compiles the code you need, and the starting point can be a single exe, so you don't need to configure a separate project for a DLL. However, it does require some learning and a bit of extra code to get it to work.
Neither C nor C++ require ahead-of-time compilation, although the usual target environments (operating systems, embedded systems, high-performance number crunching) often benefit greatly from AOT.
It's completely possible to have a C++ script interpreter. As long as it adheres to the behavior in the Standard, it IS C++, and not a "hoax" as you suggest.
Replacement of a shared library is also possible, although to make it work well you'll need a mechanism for serializing all state and reloading it under the new version.

Any tutorial for embedding Clang as script interpreter into C++ Code?

I have no experience with LLVM or Clang yet. From what I read, Clang is said to be easily embeddable (Wikipedia: Clang); however, I did not find any tutorials on how to achieve this. So is it possible to provide the user of a C++ application with scripting powers by JIT-compiling and executing user-defined code at runtime? Would it be possible to call the application's own classes and methods and share objects?
edit: I'd prefer a C-like syntax for the scripting language (or even C++ itself)
I don't know of any tutorial, but there is an example C interpreter in the Clang source that might be helpful. You can find it here: http://llvm.org/viewvc/llvm-project/cfe/trunk/examples/clang-interpreter/
You probably won't have much of a choice of syntax for your scripting language if you go this route. Clang only parses C, C++, and Objective C. If you want any variations, you may have your work cut out for you.
I think this is exactly what you described:
http://root.cern.ch/drupal/content/cling
You can use clang as a library to implement JIT compilation as stated by other answers.
Then, you have to load up the compiled module (say, an .so library).
In order to accomplish this, you can use the standard dlopen (Unix) or LoadLibrary (Windows) to load it, then use dlsym (Unix) to dynamically look up compiled functions, say a "script" main()-like function whose name is known. Note that for C++ you would have to use mangled symbols (or declare the entry points extern "C").
A portable alternative is e.g. GNU's libltdl.
As an alternative, the "script" may run automatically at load time via module init functions or static code (the constructor of a globally defined C++ object is called immediately when the module loads).
The loaded module can directly call anything in the main application. Of course symbols are known at compilation time by using the proper main app's header files.
If you want to easily add C++ "plugins" to your program and you know the component interface a priori (say, your main application knows the name and interface of a loaded class from its .h before the module is loaded into memory), then after you dynamically load the library the class is available as if it were statically linked. Just be sure not to instantiate one of the class's objects before you dlopen() its module.
Using static code allows to implement nice automatic plugin registration mechanisms too.
I don't know about Clang but you might want to look at Ch:
http://www.softintegration.com/
This is described as an embeddable or stand-alone c/c++ interpreter. There is a Dr. Dobbs article with examples of embedding it here:
http://www.drdobbs.com/architecture-and-design/212201774
I haven't done more than play with it but it seems to be a stable and mature product. It's commercial, closed-source, but the "standard" version is described as free for both personal and commercial use. However, looking at the license it seems that "commercial" may only include internal company use, not embedding in a product that is then sold or distributed. (I'm not a lawyer, so clearly one should check with SoftIntegration to be certain of the license terms.)
I am not sure that embedding a C or C++ compiler like Clang is a good idea in your case, because the "script", that is, the (C or C++) code fed in at runtime, can be arbitrary and so can crash the entire application. You usually don't want faulty user input to be able to crash your application.
Be sure to read What Every C Programmer Should Know About Undefined Behavior, because it is relevant and applies to C++ as well (including any "C++ script" used by your application). Notice that, unfortunately, a lot of UB doesn't crash the process (for example, a buffer overflow could corrupt some completely unrelated data).
If you want to embed an interpreter, choose something designed for that purpose, like Guile or Lua, and be careful that errors in the script don't crash the entire application. See this answer for a more detailed discussion of interpreter embedding.