Best way to use Google protocol buffer(protobuf) with Matlab - c++

I am trying to import some protobuf binaries in Matlab. I see 2 ways to do it
1) Use protobuf Matlab plugin
2) Use C++ APIs provided by Google and then import data into matlab using mex files.
Since I am working with large scale data, Which one would be faster to run?

Faster to run is likely to be C++, of those two.
I can tell you definitely that coming up to the Matlab-language level for byte, two-byte, four-byte reads or buffer accesses runs a lot slower than you might expect.
And looking in protobuf-matlab/source/browse/protobuflib, I see tons of that type of operation being done in Matlab code.
You didn't list what I would think to be the best of both worlds: the Java API.
If you have any Java code in your project, or a modest level of comfort with Java, I'd strongly consider using the Java API. Matlab has really good interoperability with Java; it has the big advantage that you can work through examples interactively with methodsview(), etc. There is some learning curve with javaObject()/javaMethod(), vs. when one can use a more native syntax.

Related

Use C/C++ in Apache-Flink

My team and I are developing an application that makes use of Flink.
The data will be processed using a computationally-heavy numerical algorithm.
In order to optimize it as much as possible, I would like to write this algorithm in C/C++ rather than in Java.
The question is: is it possible to use C/C++ code within Flink? Perhaps by wrapping it into a Java library?
I've never tested this case in particular. In general, you can always use native code from Java using JNI, the Java Native Interface.
The idea would be to have a Java facade expose your native code and use these methods in your computation graph defined with Flink in Java (or other JVM languages, like Scala). You would have to make both Java and native libraries available on all involved nodes to make this work. If you have an Hadoop cluster you can leverage YARN to ship files along with your job (docs here, see --yarn-ship CLI option).
I would suggest you test this incrementally, with a very small native function exposed. Also, don't underestimate Java's capabilities in terms of performance: with some well thought programming and leveraging JIT and other runtime optimizations, long running processes can enjoy even better performances than similar native code with unmanaged memory.
Keep in mind that this of course resorting to native code will mean restricting the portability of your code to the platforms for which you'll compile your libraries.

Use Matlab data structures in C++?

I am currently working on a project in C++, and I am actually interested in using Matlab data structures, instead of having to create my own data types (such as matrices, arrays, etc.)
Is there a way to seamlessly use Matlab objects in C++? I don't mind having to run Matlab in the background while my program runs.
EDIT: A starting point is this: http://www.mathworks.co.uk/help/matlab/calling-matlab-engine-from-c-c-and-fortran-programs.html. I will continue reading this.
You can use instead Armadillo C++ maths library; used by NASA, Boeing, Siemens, Deutsche Bank, MIT, CMU, Stanford, etc.
They have good documentation and examples if you are more familiar with MATLAB/OCTAVE
http://arma.sourceforge.net/docs.html#syntax
I would prefer using native C++ library of some sort and not Matlab. This is likely to be faster for both development and execution.
From writing C++ extensions for Matlab I learned one thing: Using Matlab objects in C++ is likely to give you considerable headache.
Matlab data structures are not exposed as C++ classes. Instead, you get pointers that you can manipulate with C-like API functions.
I recommend to use a native C++ library such as Eigen3.
The functionality you are looking at is not really intended to be used as seamless objects. In the past when I have used it I found it much simpler to do the C parts using either native arrays or a third party matrix library and then convert it into a Matlab matrix to return.
Mixing Matlab and C++ is typically done in one of two ways:
Having a C++ program call Matlab to do some specialist processing. This is chiefly useful for rapid development of complex matrix algorithms. You can do this either by calling the full Matlab engine, or by packaging you snippet of Matlab code into a shared library for distribution. (The distributed version packages a distributable copy of the Matlab runtime which is called with your scripts).
Having a Matlab script call a C++ function to do some specialist processing. This is often used to embed C++ implementations of algorithms (such as machine learning models) or to handle specific optimizations.
Both of these use cases have some overhead transferring the data to/from Matlab.
If you are simply looking for some matrix code to use in C++ you would be better off looking into the various C++ matrix libraries, such as the one implemented in Boost.
You can do mixed programming with C++ and Matlab. There are two possible ways:
Call MATLAB Engine directly: Refer to this post for more info. Matlab will run in the background.
Distribute MATLAB into independent shared library: check out here on how to do this (with detail steps and example).

Deciding on a language/framework for a modular OpenCV application

What's this about?
We have a C++ application dealing with image processing and computer vision on videos using OpenCV, we're going to rewrite it from scratch and need some help deciding what technologies to use. More specifically, I need help on how to choose the technology I'd use.
About the app
The app's functionality is divided in modules that are called in an order defined by a configuration XML file and can also be changed in runtime, but not in realtime (i.e. the application doesn't need to close, but the processing will start from scratch). These modules share data in a central datapool.
Why are we starting from scratch?
This application wasn't planned to be as dynamic as it currently strives to be, so it's grown to be a collection of buggy patches, macros and workarounds; it's now full of memory leaks, unnecessary QT dependencies, slow conversions between QT and OpenCV image formats and compilation and testing times have grown too much.
Language choice
The original code used C++, just because the guy who originally started the project only knew C++. This may be a good choice, because we need it to be as fast as possible, but there may be better choices to account for the dynamic nature of the application.
We're limited by the languages supported by OpenCV (C++, Java and Python mainly; although I've read there is also support for Ruby, Ch, C# and any JVM language)
What is needed
Speed: We're aiming for realtime tracking. This may rule out Python and Ruby.
Class Instantiation by name: Although our C++ macros and class registration system work, a language designed to be dynamic that has it's own runtime would be nice. Maybe Objective-C++, or Java.
What would be ideal
Module/Plugin/Extension/Component Framework: Why reinvent the wheel, using a good framework for this would let us focus on what's special about our app. There are many options here. Objective-C has it's NSBundles; C++ has libraries like Boost.Extension, Pluma, DynObj, FxEngine, etc; C has C-Pluff; I'd even say there are too many options.
Runtime class loading and reloading: From a developing point of view, it would be interesting to be able to compile and reload just one module. I've seen this done in via code injection in Objective-C and using Java's reflection.
What am I missing?
I have too many interesting options!
Here's where I need help, based on your experiences in modular app development, with this constraints, what kind of language/framework feature should I be looking for?
What question should I make myself about this project that would let me narrow my search?
Edit
I hadn't noticed that OpenCV had GPU bindings only for C++, so I'm stuck with it.
Now that the language is fixed, the search has narrowed a lot. I could use Objective-C++ to get the dynamism needed (Obj-C runtime + NSBundle from Cocoa/GnuStep/Cocotron), which sounds complicated; or C++ with a framework.
So I'll now narrow my question to:
Is using NSBundle in a crossplatform way with Objective-C++ easier than it sounds?
What C++ framework will provide me with hot-swappable modules?
The main reason for swapping modules in runtime is to be able to change code in a fast way, would Runtime-Compiled C++ be a better solution?
Meta: I did my research on how to ask a question like this, I hope it's acceptable.
"What question should I make myself about this project that would let me narrow my search?"
if you need gpu support(cuda/ocl), your only choice is c++.
you can safely discard C, as it won't be supported in the near future
have no fear of python, even if you need direct pixel access, that's all numpy arrays (running c-code again)
i'd be a bit sceptical of ruby, c# ch and the like, since those bindings are community based, and might not be up to date / maintained properly, while the java & python bindings are machine - generated from the c++ api, and are part of the official distribution.
If you're looking for portability and have large memory for disposal then you can go with Java.
The performance hit between C++ and Java is not that bad. For conversion between Mat and other image format I'm still not sure, coz it needs deep copy to perform that, so if your code can display the image in openCV native format then you can fasten the application
pro :
You can stop worrying about memory leak
The project is much more portable compared to C/C++(this can be wrong if you can avoid using primitive datatypes which size is non consistent and for example always use int*_t in C)
cons:
slower than C/C++
more memory and CPU clock needed
http://www.ibm.com/developerworks/java/library/j-jtp09275/index.html

How interoperability works

I know that many large-scale applications such as video games are created using multiple langages. For example, it's likely the game/physics engines are written in C++ while gameplay tasks, GUI are written in something like Python or Lua.
I understand why this division of roles is done; use lower-level languages for tasks that require extreme optimization, tweaking, efficiency and speed, while using higher-level languages to speed up production time, reduce nasty bugs ect.
Recently, I've decided to undertake a larger personal project and would like to divy-up parts of the project similar to above. At this point in time, I'm really confused about how this interoperability between languages (especially compiled vs interpreted) works.
I'm quite familiar with the details of going from ANSCII code test to loading an executable, when written in something like C/C++. I'm very curious at how something like a video game, built from many different languages, works. This is a large/broad question, but specifically I'm interested in:
How does the code-level logic work? I.e. how can I call Python code from a C++ program? Especially since they don't support the same built-in types?
What does the program image look like? From what I can tell, a video game is running in a single process, so what does the runtime image look like when running a C/C++ program that calls a Python function?
If calling code from an interpreted language from a compiled program, what are the sequence of events that occur? I.e If I'm inside my compiled executable, and for some reason have a call to an interpreted language inside a loop, do I have to wait for the interpreter every iteration?
I'm actually finding a hard time finding information on what happening at the machine-level, so any help would be appreciated. Although I'm curious in general about interoperation of software, I'm specifically interested in C++ and Python interaction.
Thank you very much for any insight, even if it's just pointing me to where I can find more information.
In the specific case of python, you have basically three options (and this generally applies across the board):
Host python in C++: From the perspective of the C++ programme, the python interpreter is a C library. On the python side, you may or may not need to use something like ctypes to expose the C(++) api.
Python uses C++ code as DLLs/SOs - C++ code likely knows nothing of python, python definitely has to use a foreign function interface.
Interprocess communication - basically, two separate processes run, and they talk over a socket. These days you'd likely use some kind of web services architecture to accomplish this.
Depending on what you want to do:
Have a look at SWIG: http://www.swig.org/ It's a tool that aims to connect C/C++ code with Python, Tcl, Perl, Ruby, etc. The common use case is a Python interface (graphical or not) that will call the C/C++ code. SWIG will parse the C/C++ code in order to generate the interfaces.
Libpython: it's a lib that allows you to embed Python code. You have some examples here: http://docs.python.org/3.0/extending/embedding.html

Call MATLAB APIs in C/C++

I just heard from somewhere that for numerical computation, "MATLAB does offer some user-friendly APIs. If you call these APIs in your C/C++ code, you can speed up computation dramatically."
But I did not find such information in MATLAB documents like http://www.mathworks.com/support/tech-notes/1600/1622.html and http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_external/bp_kqh7.html. All I learned from these websites is that MATLAB can be called in C and C++ by Matlab engine or by compiling M-files into libraries by mcc. They don't mention any built-in numerical MATLAB APIs that can be called in C/C++.
Can someone please clarify?
Thanks and regards!
You want the "Engine" routines. This allows you to start up a background MATLAB process from C and execute calculations on it: relevant MATLAB documentation.
It works pretty well, have a look at the examples. I would say the most annoying thing getting it working is marshaling the data between C and MATLAB. But that's always a problem when doing this kind of thing.
It sounds like you're looking for the code generation tools in the embedded matlab toolbox or real time workshop.
Do doc eml and look for a the LMS (least mean square) equalizer demo.
The code generator is quite good, it will give you a make file that will build a static library. It's easy to use with your stand alone C/C++ code.
There could be a few things that quote is referencing, I assume that it is referring to the MATLAB Compiler. So going from MATLAB -> C++ you can use the compiler to build standalone "faster" applications. However, when speed testing the improvement, I've noticed it to be negligible. Honestly, you're probably far better off coding your work in C from the get-go, the code that the compiler generates is spaghetti and non-object oriented. I should also mention that this is an expensive extension to Matlab.
You can use the MCR in your own c++ project as a stand-alone library (details)... but you might get similar results using Numerical Recipes.
Disclaimer: I used this product 2-3 years ago, things could be different now.