Which version of Java should I use for Clojure (performance)? - clojure

In order to get the most performance, should I use the latest Java, i.e. Java 8 for Clojure 1.6 development?
Will Java 8 improve JVM performance over Java 7?

Java updates generally optimize and improve performance on the most common operations - allocation, synchronization, virtual calls, etc. All of those are entirely applicable to Clojure code.
Anecdotally, Java 8 seems about 10% faster than Java 7 for me on typical Clojure programs.
My general recommendation is to use the newest stable Java version for Clojure applications.

Newer versions of Java generally are faster than old versions. I would expect Java 8 to be at least the same, if not a little faster for Clojure code.
Java 8 is faster than Java 6 and Java 7 in this benchmark: http://www.optaplanner.org/blog/2014/03/20/HowMuchFasterIsJava8.html
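To check this on your own machine rather than rely on anecdotes, a rough timing harness along these lines can be run under each installed JVM. It is only a minimal sketch: the class name JvmTiming and the workload method are made-up stand-ins for your real code, and a hand-rolled timer like this is far less reliable than a proper benchmarking tool.

```java
import java.util.concurrent.TimeUnit;

class JvmTiming {
    // Keeps results observable so the JIT is less likely to discard the work.
    static volatile long sink;

    // Hypothetical stand-in workload: boxing (allocation) plus a virtual call.
    static long workload(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Object boxed = Integer.valueOf(i); // allocates outside the small-integer cache
            sum += boxed.hashCode();           // virtual call; Integer.hashCode() is the int value
        }
        return sum;
    }

    // Runs the workload a few times to let the JIT warm up, then times one more run.
    static long timeMillis(int warmupRuns, int n) {
        for (int i = 0; i < warmupRuns; i++) {
            sink = workload(n);
        }
        long start = System.nanoTime();
        sink = workload(n);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Print which JVM is actually running before reporting a number.
        System.out.println(System.getProperty("java.vm.name") + " "
                + System.getProperty("java.version"));
        System.out.println("elapsed ms: " + timeMillis(20, 5_000_000));
    }
}
```

Running the same class under Java 7 and Java 8 gives a first impression only; for a Clojure application, timing the application itself is more meaningful than any synthetic loop.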

Related

Use C/C++ in Apache-Flink

My team and I are developing an application that makes use of Flink.
The data will be processed using a computationally-heavy numerical algorithm.
In order to optimize it as much as possible, I would like to write this algorithm in C/C++ rather than in Java.
The question is: is it possible to use C/C++ code within Flink? Perhaps by wrapping it into a Java library?
I've never tested this case in particular. In general, you can always use native code from Java using JNI, the Java Native Interface.
The idea would be to have a Java facade expose your native code and use these methods in your computation graph defined with Flink in Java (or other JVM languages, like Scala). You would have to make both the Java and the native libraries available on all involved nodes to make this work. If you have a Hadoop cluster you can leverage YARN to ship files along with your job (docs here, see --yarn-ship CLI option).
I would suggest you test this incrementally, starting with a very small native function exposed. Also, don't underestimate Java's capabilities in terms of performance: with some well-thought-out programming and by leveraging the JIT and other runtime optimizations, long-running processes can achieve even better performance than similar native code with unmanaged memory.
Keep in mind that resorting to native code will, of course, restrict the portability of your code to the platforms for which you compile your libraries.
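As a minimal sketch of the facade idea described above: the class name NativeKernel, the library name "heavykernel" and the sumOfSquares function are all hypothetical, chosen only for illustration. The pure-Java fallback is an assumption added here so that the class still works on a node where the native library is missing; it is not something Flink or JNI requires.

```java
class NativeKernel {
    // True only if the (hypothetical) native library could be loaded on this node.
    private static final boolean NATIVE_AVAILABLE = tryLoad();

    private static boolean tryLoad() {
        try {
            // Looks for libheavykernel.so / heavykernel.dll on java.library.path.
            System.loadLibrary("heavykernel");
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;
        }
    }

    // Declaration only; the body would be implemented in C/C++ via JNI.
    private static native double sumOfSquaresNative(double[] values);

    // Pure-Java equivalent, used when the native library is not present.
    static double sumOfSquaresJava(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v * v;
        }
        return sum;
    }

    // The facade method that Flink operators (e.g. a map function) would call.
    static double sumOfSquares(double[] values) {
        return NATIVE_AVAILABLE ? sumOfSquaresNative(values) : sumOfSquaresJava(values);
    }
}
```

Starting with one tiny function like this makes it easy to confirm that the library really is shipped to, and loadable on, every node before the heavy algorithm is ported.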

JVM Gotchas, especially for Clojure

I remember I used to work at a company that couldn't run their JVM software on the OpenJDK JVM. They had to use the Oracle JVM. (Full disclosure: they were writing in groovy/grails.)
But I look at a lot of other JVM applications, and they seem to work fine on both JVMs. The OpenJDK JVM seems to be a solid implementation.
Being a Clojure enthusiast, I want to be able to code for both JVMs.
So, specifically:
What are some common "gotchas" which, if you were targeting one JVM, you would have to be careful about when writing for a different JVM?
Are there any language-specific pitfalls, especially when it comes to Clojure?
When writing a Clojure application, are there any common pitfalls in targeting both JVMs?
I don't know of any significant issues between different JDKs for Clojure. We do matrix test builds on several JDK versions and providers - see http://build.clojure.org/job/clojure-test-matrix/ for the current list.

Clojure performance on JVM versus CLR

Are there any performance comparisons of Clojure on JVM versus CLR? Or maybe someone who has used both with performance-sensitive code can give some anecdotal comments?
The performance of Clojure JVM is better than that of Clojure CLR. I don't have explicit benchmarks to point to, but I have a lot of experience doing compilation and running tests in both environments and the difference is obvious.
There are several factors involved in the difference. Some are being worked on. Some are related to JVM vs CLR perf differences and hence beyond the means of the ClojureCLR developers to address.
(1) Compilation of Clojure code to the platform Intermediate Language.
At the most basic level, the IL generated is almost identical. However, design choices forced by some limitations of the Dynamic Language Runtime result in each function definition creating an extra class and function invocations to have an extra method call. Version 1.4 of ClojureCLR (coming soon) eliminates the use of the DLR for most code generation. (The DLR will still be used for CLR interop and polymorphic inline caching.) At this point, generated code will be substantially the same as the JVM version. Startup time has been reduced by 10% and simple benchmarks show 4-16% improvements over version 1.3. More details here.
(2) Startup time
Clojure JVM starts significantly faster than Clojure CLR. Most of this is traceable to the JVM being able to selectively load class files (versus the CLR loading entire assemblies) and differences in when JIT compilation occurs. However, if ClojureCLR is NGEN'd, startup times are really fast. More details here.
(3) JVM versus CLR performance
Some attention has been paid to making ClojureJVM work well with HotSpot compiler optimizations. I don't have explicit proof, but I'm guessing that HotSpot just does a better job on things like inlining in compiled Clojure code versus the CLR JITter. It is fair to say that no attention has been paid to how to make ClojureCLR take better advantage of the CLR JITter.
The release of ClojureCLR 1.4 will provide a good opportunity for some benchmarking.
I've not really used the CLR version so can't fully answer your question.
However it is worth noting that most of the optimisation / development effort so far has gone into the mainline JVM version of Clojure. As a result you can expect the JVM version of Clojure to perform considerably better at present in most situations.
Clojure on the JVM is already one of the fastest dynamically typed languages around - according to the benchmarks game page, Common Lisp is the only dynamically typed language which is (marginally) faster.
Over time I'd expect the Clojure JVM/CLR gap to narrow as both versions tend towards the performance of their host platforms. But right now, if performance is your key concern, I'd definitely recommend the JVM version (as well as performance, the JVM version is also likely to be better for overall maturity, library availability and cross platform support).

Implementations of Clojure for other platforms?

Are there any implementations of Clojure being built for other virtual machines (such as .Net, Python, Ruby, Lua), or is it too closely tied to Java and the JVM? Does it make sense to build a Clojure for other platforms?
There are currently three implementations of Clojure that I know of:
ClojureCLR, an implementation of Clojure for the CLI,
ClojureScript, an implementation of (a subset of (a variant of)) Clojure for ECMAScript and
a Clojure implementation for the Java platform, confusingly also called Clojure.
In fact, the name Clojure was specifically chosen by Rich Hickey because it contained both the letters CLR as well as the letter J.
I've heard rumours of implementations for the Objective-C/Cocoa runtime, LLVM and the Rubinius VM, but I have no idea whether or not those actually exist.
"or is it too closely tied to Java and the JVM? Does it make sense to build a Clojure for other platforms?"
One of the Clojure design philosophies is to embrace the host platform. Clojure on the JVM embraces the JVM and gives direct access to classes, numbers, etc. Interop works both ways without glue.
ClojureScript embraces JavaScript (ECMAScript) in exactly the same way, giving direct access to objects, numbers, etc. The same goes for the .NET target.
It is tempting, but not always successful, to make 'cross platform' languages that run the exact same source code on multiple platforms. Thus far Clojure has avoided this temptation and strives to remain close to the host.
There exists at least a ClojureCLR project by Rich Hickey himself.
This project is a native implementation of Clojure on the Common Language Runtime (CLR), the execution engine of Microsoft's .Net Framework.
ClojureCLR is programmed in C# (and Clojure itself) and makes use of Microsoft's Dynamic Language Runtime (DLR).
I'm not sure that Python and Ruby ports make sense; those are languages with multiple virtual machines / implementations. If you want to have native interop between Clojure and Python or Ruby you could use Jython or JRuby and stay on the JVM.

What is a good scripting language to integrate into high-performance applications?

I'm a games developer and am currently in the process of writing a cross-platform, multi-threaded engine for our company. Arguably, one of the most powerful tools in a game engine is its scripting system, hence I'm on the hunt for a new scripting language to integrate into our engine (currently using a relatively basic in-house engine).
Key features for the desired scripting system (in order of importance) are:
Performance - MUST be fast to call & update scripts
Cross platform - Needs to be relatively easy to port to multiple platforms (don't mind a bit of work, but should only take a few days to port to each platform)
Offline compilation - Being able to pre-parse the script code offline is almost essential (helps with file sizes and load times)
Ability to integrate well with c++ - Should be able to support OO code within the language, and integrate this functionality with c++
Multi-threaded - not required, but desired. Would be best to be able to run separate instances of it on multiple threads that don't interfere with each other (i.e. no globals within the underlying code that need to be altered while running). Critical Section and Mutex based solutions need not apply.
I've so far had experience integrating/using Lua, Squirrel (OO language, based on Lua) and have written an ActionScript 2 virtual machine.
So, what scripting system do you recommend that fits the above criteria? (And if possible, could you also post or link to any comparisons to other scripting languages that you may have)
Thanks,
Grant
Lua has the advantage of being time-tested by a number of big-name video game developers and a good base of knowledgeable developers thanks to Blizzard-Activision's adoption of it as the primary platform for developing World of Warcraft add-ins.
Lua is a very good match for your needs. I'll take them in the same order.
Lua is one of the fastest scripting languages. It's fast to compile and fast to run.
Lua compiles on any platform with an ANSI C compiler, which afaik includes all gaming platforms.
Lua can be pre-compiled, but as it is a very dynamic language, most errors are only detectable at runtime. Also, precompiled code (as bytecode) is often larger than the source code.
There are many Lua/C++ binding tools.
It doesn't support multi-threading (you cannot access a single instance of the interpreter from multiple threads), but you can have several instances of the interpreter, one per thread, or even one per game object.
Lua has been used in the video-game industry for years. It is lightweight and efficient.
That being said, ChaiScript and Falcon are good candidates matching your needs, offering a higher-level language than Lua but with less history and community support.
Lua
Boost Python
SWIG
We've had good luck with Squirrel so far. Lua is so popular it's on its way to becoming a standard.
I recommend you worry more about memory than speed. Most scripting languages are "fast enough" and if they get slow you can always push some of that functionality back down into C++. Many of them burn through lots of memory, though, and on a console memory is an even more scarce resource than CPU time. Unbounded memory consumption will crash you eventually, and if you have to allocate 4MB just for the interpreter, that's like having to throw 30 textures out the window to make room.
Lua, and then LuaJIT for extra blaziness!
Just don't expect too much from automatic C++ binding libraries; most are slow and restrictive. Better to write your own bindings for your own objects.
As for concurrency, either use LuaLanes or roll your own. If your C++ program is already multithreaded, just call separate LuaStates from each thread, and use your own C++ shared structures as communication channels if needed.
As you might already know, the most often repeated answer in Lua is 'roll your own', and it's often the best advice! The exception is bindings to common C/C++ libraries; in that case it's quite probable there's already one.
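The "separate state per thread, shared structure as channel" pattern from the concurrency advice above can be sketched as follows. The sketch is in Java rather than C++ and Lua, and ScriptState is a made-up stand-in for an interpreter state such as a Lua state; the point is only that no state is ever touched by more than one thread, and results travel through a single shared queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class PerThreadStates {
    // Hypothetical stand-in for one interpreter instance; it has no shared globals.
    static final class ScriptState {
        private long counter = 0;

        long update(int delta) {
            counter += delta;
            return counter;
        }
    }

    // Starts one worker per thread, each with its own state, and sums their results.
    static long runWorkers(int threads, final int updatesPerThread) {
        // The only shared structure: a thread-safe channel for results.
        final BlockingQueue<Long> results = new LinkedBlockingQueue<>();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                ScriptState state = new ScriptState(); // private to this thread
                long last = 0;
                for (int i = 0; i < updatesPerThread; i++) {
                    last = state.update(1);
                }
                results.add(last);
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        long total = 0;
        for (Long r : results) {
            total += r;
        }
        return total;
    }
}
```

Because each worker owns its state outright, the script-side code needs no locking at all; only the hand-off through the queue is synchronized.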
If you haven't looked at it yet I would suggest you check out Angelscript.
I have successfully used it in a cross platform environment (Windows and Linux with only a recompile) and it is designed to integrate well with C++ (both objects and code).
It is lightweight and supports multi-threading (in the sense that the question was asked), performs well and compiles to byte code which could be done in advance.
Start with Python.
If you can prove that you need more speed, then look at Stackless Python. That's what EVE Online uses for their game.
JavaScript may be a reasonable option, because of the mountains of effort that have gone into optimizing the various implementations for use in web-browsers.
These come to mind:
Lua
Python with boost::python
MzScheme or Guile
Ruby with SWIG