Map-reduce framework in D lang - mapreduce

looking for any map-reduce framework (even the smallest one) written in D.
Is there anything?
Thank you.

For basic map reduce funcionality you can use phobos library.
For non-parallel task use std.algorithm http://dlang.org/phobos/std_algorithm.html#map and
http://dlang.org/phobos/std_algorithm.html#reduce
For parallel use std.parallelism: http://dlang.org/phobos/std_parallelism.html#.TaskPool.map and http://dlang.org/phobos/std_parallelism.html#.TaskPool.reduce

There is MapReduce-MPI. It is written in C++ but callable from C, which means callable from D.
keep in mind though there is no fault tolerance because MPI doesn't have fault tolerance.

Related

Is it feasible to have an algorithm in C++ which calls a Fortran program for the heavily computation parts?

I am developing an algorithm that has a large numerical computation part. My project supervisor recommended me to use Fortran because of this and so for the last weeks I've been work on it (so far so good). It would be a new version of an algorithm of his, which is basically a lot of numerical computing.
Mine, however, would have more "logic" to it. Without going into much detail, the brute force approach is done using just fortran because it's just 95% of reading from a file and doing the operations. However, the aim of the project is to provide an efficient algorithm to do this, I had been thinking about methods and wanted to start with a Greedy approach (something like Hill Climbing) and that got me thinking that for this part in particular, maybe it would be better to write the algorithm in C++ instead of Fortran.
So basically, how hard do you think it would be to develop the algorithm "logic" in C++ and then call Fortran whenever the bulk of the numerical computation has to be performed. Would it be worth it? Or should I just stick with one of the two languages?
Sorry if it is a very ignorant question but I can't get an idea of whether writing an algorithm such as Hill Climbing would be more difficult if done with Fortran instead of C++ and the benefits of Fortran in this case would be worth it.
Thanks for your time and have a nice day!

C++ or Python for an Extensive Math Program?

I'm debating whether to use C++ or Python for a largely math-based program.
Both have great math libraries, but which language is generally faster for complex math?
You could also consider a hybrid approach. Python is generally easier and faster to develop in, specially for things like user interface, input/output etc.
C++ should certainly be faster for some math operations (although if your problem can be formulated in terms of vector operations or linear algebra than numpy provides a python interface to very efficient vector manipulations).
Python is easy to extend with Cython, Swig, Boost Python etc. so one strategy is write all the bookkeeping type parts of the program in Python and just do the computational code in C++.
I guess it is safe to say that C++ is faster. Simply because it is a compiled language which means that only your code is running, not an interpreter as with python.
It is possible to write very fast code with python and very slow code with C++ though. So you have to program wisely in any language!
Another advantage is that C++ is type safe, which will help you to program what you actually want.
A disadvantage in some situations is that C++ is type safe, which will result in a design overhead. You have to think (maybe long and hard) about function and class interfaces, for instance.
I like python for many reasons. So don't understand this a plea against python.
It all depends if faster is "faster to execute" or "faster to develop". Overall, python will be quicker for development, c++ faster for execution. For working with integers (arithmetic), it has full precision integers, it has a lot of external tools (numpy, pylab...) My advice would be go python first, if you have performance issue, then switch to cpp (or use external libraries written in cpp from python, in an hybrid approach)
There is no good answer, it all depends on what you want to do in terms of research / calculus
It goes witout saying that C++ is going to be faster for intensive numeric computations. However, there are so many pre-existing libraries out there (written in C/C++/Haskell etc..), with Python wrappers - it's just more convenient to utilise the convenience of Python and let the existing libraries carry the load.
One comprehensive system is http://www.sagemath.org and a fairly interesting link is the components it uses at http://sagemath.org/links-components.html.
A system with numpy/scipy and pandas from my experience is normally sufficient for most things.
Use the one you like better (and you should like python better :)).
In either case, any math-intensive computations should be carried out by existing libraries - which aren't language dependent (usually BLAS / LAPACK are used to perform the actual math).
If you choose to go with python, use numpy
for calculations.
Edit: From your comments, it seems that you are very concerned with the speed of your program. The only way to know for sure how much time is wasted by the high level pythonic code is to profile your program (for example, use ipython with run -p).
In most cases, you will find that the high level stuff takes about 10% of the total running time, and therefore switching from python to C++ will only improve that 10% by some factor, for a total gain of perhaps 5% in running time.
I sincerely doubt that Google and Stanford don't know C++.
"Generally faster" is more than just language. Algorithms can make or break a solution, regardless of what language it's written in. A poor choice written in C++ and be beaten by Java or Python if either makes a better algorithm choice.
For example, an in-memory, single CPU linear algebra library will have its doors blown in by a parallelized version done properly.
An implicit algorithm may actually be slower than an explicit one, despite time step stability restrictions, because the latter doesn't have to invert a matrix. This is often true for hyperbolic partial differential equations.
You shouldn't care about "generally faster". You ought to look deeply into the problem you're trying to solve and the algorithms used to solve it. You'll do better that way than a blind language choice.
I would go with Python running on Java platform. This approach is implemented in DataMelt program. Algorithm that call Java libraries from Python can be faster, since JVM optimizes the code for you.

evaluating multiple integrals

Is there any library to evaluate multidimensional integrals? I have at least 4 (in general much more than that), where the integrand is a combination of variables, so I cannot separate them. Do you know of any library for numerical evaluation? I'm especially looking for either matlab or c++, but I will use anything that will do the work.
Since you don't specify the kind of integrals or the actual dimensionality, I can only suggest that you take into account that
where the function F(x) is defined as
and use this fact to compute your integrals with the usual quadrature techniques. For example, you could use trapz or quad in MATLAB. However, if the dimensionality is truly high, then you are better off using Monte Carlo algorithms.
First link off google.
Seems pretty roboust.
"Numerical Recipes In C" has a very nice chapter on numerical integration.
Maybe Gaussian quadrature can help you out.
Yes there is TESTPACK which is C++ program which demonstrates the testing of a routine for multidimensional integration.

How to choose an integer linear programming solver?

I am newbie for integer linear programming.
I plan to use a integer linear programming solver to solve my combinatorial optimization problem.
I am more familiar with C++/object oriented programming on an IDE.
Now I am using NetBeans with Cygwin to write my applications most of time.
May I ask if there is an easy use ILP solver for me?
Or it depends on the problem I want to solve ? I am trying to do some resources mapping optimization. Please let me know if any further information is required.
Thank you very much, Cassie.
If what you want is linear mixed integer programming, then I would point to Coin-OR (and specifically to the module CBC). It's Free software (as speech)
You can either use it with a specific language, or use C++.
Use C++ if you data requires lots of preprocessing, or if you want to put your hands into the solver (choosing pivot points, column generation, adding cuts and so on...).
Use the integrated language if you want to use the solver as a black box (you're just interested in the result and the problem is easy or classic enough to be solved without tweaking).
But in the tags you mention genetic algorithms and graphs algorithms. Maybe you should start by better defing your problem...
For graphs I like a lot Boost::Graph
I have used lp_solve ( http://lpsolve.sourceforge.net/5.5/ ) on a couple of occasions with success. It is mature, feature rich and is extremely well documented with lots of good advice if your linear programming skills are rusty. The integer linear programming is not a just an add on but is strongly emphasized with this package.
Just noticed that you say you are a 'newbie' at this. Well, then I strongly recommend this package since the documentation is full of examples and gentle tutorials. Other packages I have tried tend to assume a lot of the user.
For large problems, you might look at AMPL, which is an optimization interpreter with many backend solvers available. It runs as a separate process; C++ would be used to write out the input data.
Then you could try various state-of-the-art solvers.
Look into GLPK. Comes with a few examples, and works with a subset of AMPL, although IMHO works best when you stick to C/C++ for model setup. Copes with pretty big models too.
Linear Programming from Wikipedia covers a few different algorithms that you could do some digging into to see which may work best for you. Does that help or were you wanting something more specific?

Open source C++ library for vector mathematics

I would need some basic vector mathematics constructs in an application. Dot product, cross product. Finding the intersection of lines, that kind of stuff.
I can do this by myself (in fact, have already) but isn't there a "standard" to use so bugs and possible optimizations would not be on me?
Boost does not have it. Their mathematics part is about statistical functions, as far as I was able to see.
Addendum:
Boost 1.37 indeed seems to have this. They also gracefully introduce a number of other solutions at the field, and why they still went and did their own. I like that.
Re-check that ol'good friend of C++ programmers called Boost. It has a linear algebra package that may well suits your needs.
I've not tested it, but the C++ eigen library is becoming increasingly more popular these days. According to them, they are on par with the fastest libraries around there and their API looks quite neat to me.
Armadillo
Armadillo employs a delayed evaluation
approach to combine several operations
into one and reduce (or eliminate) the
need for temporaries. Where
applicable, the order of operations is
optimised. Delayed evaluation and
optimisation are achieved through
recursive templates and template
meta-programming.
While chained operations such as
addition, subtraction and
multiplication (matrix and
element-wise) are the primary targets
for speed-up opportunities, other
operations, such as manipulation of
submatrices, can also be optimised.
Care was taken to maintain efficiency
for both "small" and "big" matrices.
I would stay away from using NRC code for anything other than learning the concepts.
I think what you are looking for is Blitz++
Check www.netlib.org, which is maintained by Oak Ridge National Lab and the University of Tennessee. You can search for numerical packages there. There's also Numerical Recipes in C++, which has code that goes with it, but the C++ version of the book is somewhat expensive and I've heard the code described as "terrible." The C and FORTRAN versions are free, and the associated code is quite good.
There is a nice Vector library for 3d graphics in the prophecy SDK:
Check out http://www.twilight3d.com/downloads.html
For linear algebra: try JAMA/TNT . That would cover dot products. (+matrix factoring and other stuff) As far as vector cross products (really valid only for 3D, otherwise I think you get into tensors), I'm not sure.
For an extremely lightweight (single .h file) library, check out CImg. It's geared towards image processing, but has no problem handling vectors.