C++ decision tree with pruning - c++

Can you recommend a good C++ decision tree class with support for continuous features and pruning (pruning is very important)? I'm writing a simple two-class classifier using 9 features. I've been using Waffles recently, but the tree appears to be overfitting: I get precision of around 82% but recall of only around 51%, which is unacceptable. Waffles has no ability to prune decision trees, and I'm running out of time :)

I have to answer my own question, since no one else has answered.
I used the decision tree implementation from the OpenCV library; it is very flexible and fast enough for my tasks.
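Whichever library is used, the effect of pruning on the precision and recall quoted in the question is best checked on held-out data rather than on the training set. A minimal sketch of that measurement follows; the names `Confusion`, `tally`, `precision`, and `recall` are illustrative and do not belong to Waffles or OpenCV, and labels are assumed to be encoded as 0/1 with 1 as the positive class.

```cpp
#include <cstddef>
#include <vector>

// Confusion counts for a two-class problem (1 = positive class, 0 = negative).
struct Confusion {
    std::size_t tp = 0, fp = 0, fn = 0, tn = 0;
};

// Tally predictions against ground truth; both vectors hold 0/1 labels.
// Only the common prefix is compared if the sizes differ.
Confusion tally(const std::vector<int>& truth, const std::vector<int>& predicted) {
    Confusion c;
    const std::size_t n =
        truth.size() < predicted.size() ? truth.size() : predicted.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (predicted[i] == 1) {
            if (truth[i] == 1) ++c.tp; else ++c.fp;
        } else {
            if (truth[i] == 1) ++c.fn; else ++c.tn;
        }
    }
    return c;
}

// Fraction of predicted positives that are truly positive.
double precision(const Confusion& c) {
    const std::size_t denom = c.tp + c.fp;
    return denom == 0 ? 0.0 : static_cast<double>(c.tp) / denom;
}

// Fraction of true positives that the classifier actually found.
double recall(const Confusion& c) {
    const std::size_t denom = c.tp + c.fn;
    return denom == 0 ? 0.0 : static_cast<double>(c.tp) / denom;
}
```

Comparing these two numbers for the unpruned and the pruned tree on the same held-out set shows whether pruning actually improved recall.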

Related

Is it feasible to have an algorithm in C++ which calls a Fortran program for the computation-heavy parts?

I am developing an algorithm that has a large numerical computation part. My project supervisor recommended that I use Fortran because of this, so for the last few weeks I've been working on it (so far so good). It would be a new version of an algorithm of his, which is basically a lot of numerical computing.
Mine, however, would have more "logic" to it. Without going into much detail, the brute-force approach is done using just Fortran because it is 95% reading from a file and doing the operations. However, the aim of the project is to provide an efficient algorithm to do this. I had been thinking about methods and wanted to start with a greedy approach (something like Hill Climbing), and that got me thinking that for this part in particular, maybe it would be better to write the algorithm in C++ instead of Fortran.
So basically, how hard do you think it would be to develop the algorithm "logic" in C++ and then call Fortran whenever the bulk of the numerical computation has to be performed? Would it be worth it? Or should I just stick with one of the two languages?
Sorry if it is a very ignorant question, but I can't tell whether writing an algorithm such as Hill Climbing would be more difficult in Fortran than in C++, and whether the benefits of Fortran in this case would be worth it.
Thanks for your time and have a nice day!
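To make the split described in the question concrete, here is a minimal sketch of hill-climbing "logic" in C++ that delegates every cost evaluation to a numerical kernel reached through a C-compatible interface. Everything named here is hypothetical: `cost_kernel`, its signature, and the quadratic cost are invented for illustration, and the kernel is defined in C++ as a stand-in so the sketch builds without a Fortran compiler. In a real project the definition would be replaced by a declaration and the body would live in Fortran, exported with `bind(C)`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical Fortran side, exported with a C-compatible name:
//   function cost_kernel(x, n) bind(C, name="cost_kernel") result(c)
//     use iso_c_binding
//     integer(c_int), value :: n
//     real(c_double), intent(in) :: x(n)
//     real(c_double) :: c
//     ...
//   end function
// The matching C++ declaration would be:
//   extern "C" double cost_kernel(const double* x, int n);
// Stand-in definition (assumption: a simple quadratic bowl centred at 3).
extern "C" double cost_kernel(const double* x, int n) {
    double c = 0.0;
    for (int i = 0; i < n; ++i) c += (x[i] - 3.0) * (x[i] - 3.0);
    return c;
}

// Coordinate-wise hill climbing: try +step and -step on each coordinate,
// keep any move that lowers the cost, and halve the step when nothing helps.
std::vector<double> hillClimb(std::vector<double> x, double step, double minStep) {
    const int n = static_cast<int>(x.size());
    double best = cost_kernel(x.data(), n);
    while (step > minStep) {
        bool improved = false;
        const double deltas[2] = {step, -step};
        for (std::size_t i = 0; i < x.size(); ++i) {
            for (double delta : deltas) {
                x[i] += delta;
                const double c = cost_kernel(x.data(), n);
                if (c < best) {
                    best = c;
                    improved = true;
                } else {
                    x[i] -= delta;  // undo the move
                }
            }
        }
        if (!improved) step *= 0.5;
    }
    return x;
}
```

The search loop never looks inside the kernel, so the same C++ code would work whether the heavy computation is written in Fortran or C++; only the linking step changes.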

Job scheduling to minimise loss

I have a job scheduling problem. For each job we are given a start time, the time needed to complete it, and a deadline.
It is given that start time + time to complete <= deadline.
I have also been given the loss that will occur if I am not able to complete the job before the deadline. I have to design an algorithm to minimize the loss.
I have tried adapting the standard dynamic programming algorithm for maximizing profit in job scheduling, but without success.
What algorithm can I use to solve the question?
Dynamic Programming isn't the right approach based on what you're aiming to optimize. You can find the optimized schedule by using a greedy approach.
Here's a thorough guide with sample code for your desired language (C++). The guide assumes each job takes only 1 unit of time, which you can easily modify by using time_to_complete instead.
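The linked guide is not reproduced here, so the following is only a sketch of the classic greedy for the unit-time case this answer describes, under stated assumptions: every job takes one time slot, all jobs are available from time 0 (the question's start times are ignored), and deadlines are 1-based slot numbers. The names `Job` and `minimiseLoss` are illustrative.

```cpp
#include <algorithm>
#include <vector>

struct Job {
    int deadline;  // last time slot (1-based) in which the job may run
    int loss;      // penalty paid if the job misses its deadline
};

// Greedy for unit-time jobs: consider jobs from largest loss to smallest and
// put each one in the latest free slot at or before its deadline.
// Returns the total loss of the jobs that could not be placed.
int minimiseLoss(std::vector<Job> jobs) {
    std::sort(jobs.begin(), jobs.end(),
              [](const Job& a, const Job& b) { return a.loss > b.loss; });

    int maxDeadline = 0;
    for (const Job& j : jobs) maxDeadline = std::max(maxDeadline, j.deadline);

    std::vector<bool> used(maxDeadline + 1, false);  // index 0 is unused
    int totalLoss = 0;
    for (const Job& j : jobs) {
        int slot = j.deadline;
        while (slot >= 1 && used[slot]) --slot;  // scan back for a free slot
        if (slot >= 1) {
            used[slot] = true;
        } else {
            totalLoss += j.loss;  // no slot left before the deadline
        }
    }
    return totalLoss;
}
```

For the jobs (deadline, loss) = (2, 100), (1, 19), (2, 27), (1, 25), (3, 15), the function returns 44: the jobs with losses 25 and 19 are the ones that miss their deadlines. Handling non-unit durations and start times needs more than swapping in time_to_complete, so treat this as a starting point only.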
Your problem is similar to the knapsack one. Using a greedy approach is convenient if you aren't actually looking for the best possible solution, but just a "good enough" one.
The big advantage of the greedy approach is that its computational cost is considerably lower than that of other, more thorough approaches. However, if you need the best solution to your problem, I would say that backtracking is the way to go.
Since the deadline can be violated, the problem looks like a Total Weighted Tardiness Scheduling Problem. There are many flavors of it, but most problems under this umbrella are computationally hard, so Dynamic Programming (DP) would not be my first choice. In my experience, DP also poses difficulties during modeling and implementation. The same goes for mathematical programming "as-is". Some approaches that can be implemented more quickly are:
constraint programming: a very gentle learning curve, and there are many libraries out there, including very good open source ones (most have a C++ API). Bonus: constraint programming can demonstrate optimality.
ad hoc heuristics: (1) start with a constructive algorithm (like the greedy approach suggested by Ling Zhong and Flavio Giobergia), then (2) use some local search approach to improve it, and finally (3) embed the approach into a metaheuristic scheme. This way you can build on top of the previous step and learn a lot about the problem. Note: in general, heuristics cannot demonstrate optimality.
special mention: LocalSolver, a hybrid approach between the two above: it lets you model the problem using a formalism similar to constraint programming and then solves it using heuristics. It is very easy to learn, it usually lets you get started quickly and, in my tests, it provides remarkably good results.

Gomoku state-of-the-art tech

Usually people use pn-search, pn^2, or df-pn to determine whether a winning solution exists.
Then they use alpha-beta pruning on the min-max game tree with a good evaluation function.
They can reach a depth of 15 ply or even more.
Now there is a Monte Carlo method which has been successful in dealing with Go.
Can the same technique be used in Gomoku? Are there any examples (source code or papers)?
Is there any paper describing a good way to build a well-tuned evaluation function?
Or is there any other state-of-the-art or useful technique for Gomoku?
Is pn-search necessary in dealing with Gomoku?
Are there any other VCT engines (preferably with source)?
To the best of my knowledge, proof number search, dependency-based search (also referred to as threat space search), and search algorithms based on the alpha-beta framework are the main techniques used in top Gomoku programs. There are also some Gomoku programs using Monte Carlo Tree Search, but their current results are not that good. The article at http://www.aiexp.info/gomoku-renju-resources-an-overview.html summarizes the reading materials, protocols and source code for Gomoku AI.
As for evaluation function, up to now, although there are some papers describing how to build a well tuned evaluation function for Gomoku, none of them really works to achieve the state of the art.
Pn-search is not necessary in dealing with Gomoku. In fact, the state-of-the-art Gomoku engine Yixin does not use pn-search.
Renjusolver is the best VCT engine. Apart from Renjusolver, there are many other Gomoku engines which perform relatively well at solving VCT and can be downloaded at http://gomocup.org/download/. Currently, pela is the best open source engine for solving VCT.
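For readers unfamiliar with the alpha-beta framework mentioned in this thread, here is a minimal sketch on a toy game tree. It is not taken from any Gomoku engine: the tree is a complete binary tree stored implicitly, the leaf values stand in for an evaluation function, and the names `alphaBeta` and `searchRoot` are illustrative. A real engine would generate moves from a board position instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Alpha-beta over a complete binary tree whose leaves hold static evaluations.
// `index` is the node's position within its level; `visited` counts how many
// leaves were actually evaluated. `leaves` must hold 2^maxDepth values.
int alphaBeta(const std::vector<int>& leaves, std::size_t index, int depth,
              int maxDepth, int alpha, int beta, bool maximising, int& visited) {
    if (depth == maxDepth) {
        ++visited;
        return leaves[index];
    }
    int best = maximising ? std::numeric_limits<int>::min()
                          : std::numeric_limits<int>::max();
    for (std::size_t child = 2 * index; child <= 2 * index + 1; ++child) {
        const int value = alphaBeta(leaves, child, depth + 1, maxDepth,
                                    alpha, beta, !maximising, visited);
        if (maximising) {
            best = std::max(best, value);
            alpha = std::max(alpha, best);
        } else {
            best = std::min(best, value);
            beta = std::min(beta, best);
        }
        if (alpha >= beta) break;  // cutoff: this line cannot affect the root
    }
    return best;
}

// Search from the root with the maximising player to move.
int searchRoot(const std::vector<int>& leaves, int maxDepth, int& visited) {
    visited = 0;
    return alphaBeta(leaves, 0, 0, maxDepth,
                     std::numeric_limits<int>::min(),
                     std::numeric_limits<int>::max(), true, visited);
}
```

With the leaves {3, 5, 6, 9, 1, 2, 0, -1} and maxDepth 3, `searchRoot` returns 5 after evaluating only 5 of the 8 leaves; the skipped leaves are the pruning at work.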

Bullet vs Newton Game Dynamics vs ODE physics engines [closed]

I am trying to pick a physics engine for a simple software application. It would simulate a rather small number of objects, so performance isn't a huge concern. I am mostly concerned with the accuracy of the motion involved. I would also like the engine to be cross-platform (Windows/Linux/Mac) and usable from C++ code. I was looking at Bullet, Newton Game Dynamics, and ODE because they are open source. However, if Havok/PhysX are significantly more accurate I would consider those too.
All I seem to find are opinions on the engines. Are there any thorough comparisons between the options? Or does anyone have experience trying the various engines out? Since what I'm trying to do is relatively simple, there probably isn't a huge difference between them, but I'd like to hear what people have to say about the options. Thanks!
There is a nice comparison of ODE and Bullet here:
http://blog.wolfire.com/2010/03/Comparing-ODE-and-Bullet
Hope it can be useful in making a choice.
Although it is a bit dated, there is a comprehensive comparison of (in alphabetical order) Bullet, JigLib, Newton, ODE, PhysX, and others available here:
http://www.adrianboeing.com/pal/papers/p281-boeing.pdf
The comparison considers integrators, friction models, constraint solvers, collision detection, stacking, and computational performance.
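Since integrators are among the things that comparison considers, a small engine-independent experiment can illustrate why they matter for accuracy. The sketch below is an assumption-laden toy, not code from Bullet, ODE, or any other engine: it steps a unit-mass, unit-stiffness spring with two textbook schemes and reports the relative energy drift. All names (`State`, `explicitEuler`, `semiImplicitEuler`, `relativeEnergyDrift`) are illustrative.

```cpp
#include <cmath>

// State of a 1D unit-mass, unit-stiffness oscillator: x'' = -x.
struct State {
    double x;
    double v;
};

// Total mechanical energy; the exact solution keeps this constant.
double energy(const State& s) {
    return 0.5 * (s.v * s.v + s.x * s.x);
}

// Explicit (forward) Euler: both updates use the old state.
State explicitEuler(State s, double dt) {
    return State{s.x + dt * s.v, s.v - dt * s.x};
}

// Semi-implicit (symplectic) Euler: update velocity first, then use the
// new velocity to update position.
State semiImplicitEuler(State s, double dt) {
    s.v -= dt * s.x;
    s.x += dt * s.v;
    return s;
}

// Run `steps` steps from x = 1, v = 0 and return |E_end - E_0| / E_0.
template <typename Step>
double relativeEnergyDrift(Step step, double dt, int steps) {
    State s{1.0, 0.0};
    const double e0 = energy(s);
    for (int i = 0; i < steps; ++i) s = step(s, dt);
    return std::fabs(energy(s) - e0) / e0;
}
```

For this system explicit Euler multiplies the energy by (1 + dt^2) on every step, so the drift grows without bound, while the semi-implicit variant keeps the energy error bounded. The same kind of test (a known conserved quantity tracked over many steps) can be run against any real engine to compare accuracy for a specific use case.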
Sorry, but you will not find a real comparison with respect to accuracy. I have been searching for three months for my master's thesis and have not found one. So I started to do the comparison on my own, but there is still a long way to go. I'm testing both 3D and 2D engines, and so far Chipmunk is the one with the highest accuracy. So if you have no need for 3D, I would recommend it. However, if you do need 3D and your problem is as simple as you described (and you don't want to expand it in the future), Bullet or ODE will do. I would prefer Bullet because it is much more up to date and is still actively maintained. Finally, there is Newton, which I am fighting with right now. Therefore I can't give you pros and cons for it, except that it takes a bit more work to get familiar with because of the very poor documentation.
Hope that helps. Best regards.
One thing I found really valuable in ODE is the ability to change pretty much every single parameter 'on the fly'. As an example, the engine doesn't seem to complain if you modify the inertia or even the shape of a body. You could replace a sphere with a box and everything would just keep working, or change the size of the sphere.
Other engines are not as flexible usually, because they do a lot of work internally for optimization purposes.
As for accuracy, as far as I know, ODE still supports a very accurate (but slow) solver which is obviously not very popular in the games industry because you can't play around with more than 25-30 objects in real time. Hope this helps.
Check out Simbody, which is used in engineering. It's particularly good for simulating articulated bodies. It has been used for more than 5 years to simulate human musculoskeletal dynamics. It's also one of the physics engines used in Gazebo, a robot simulation environment.
https://github.com/simbody/simbody
http://nmbl.stanford.edu/publications/pdf/Sherm2011.pdf
The Physics Abstraction Layer (PAL) supports a large number of physics engines via a unified API, making it easy to compare engines for your situation.
PAL provides a unique interface for these physics engines:
Box2D (experimental)
Bullet
Dynamechs (deprecated)
Havok (experimental)
IBDS (experimental)
JigLib
Meqon (deprecated)
Newton
ODE
OpenTissue (experimental)
PhysX (a.k.a Novodex, Ageia PhysX, nVidia PhysX)
Simple Physics Engine (experimental)
Tokamak
TrueAxis
According to the December 2007 paper linked in this answer:
Of the open source engines the Bullet engine provided the best results
overall, outperforming even some of the commercial engines. Tokamak
was the most computationally efficient, making it a good choice for
game development, however TrueAxis and Newton performed well at low
update rates. For simulation systems the most important property of
the simulation should be determined in order to select the best
engine.
Here is a September 2007 demo by the same author:
https://www.youtube.com/watch?v=IhOKGBd-7iw

How to choose an integer linear programming solver?

I am a newbie to integer linear programming.
I plan to use an integer linear programming solver to solve my combinatorial optimization problem.
I am more familiar with C++/object-oriented programming in an IDE.
Currently I use NetBeans with Cygwin to write my applications most of the time.
May I ask if there is an easy-to-use ILP solver for me?
Or does it depend on the problem I want to solve? I am trying to do some resource mapping optimization. Please let me know if any further information is required.
Thank you very much, Cassie.
If what you want is linear mixed integer programming, then I would point to COIN-OR (and specifically to the CBC module). It's free software (free as in speech).
You can either use it with a specific language, or use C++.
Use C++ if your data requires lots of preprocessing, or if you want to get your hands into the solver (choosing pivot points, column generation, adding cuts and so on...).
Use the integrated language if you want to use the solver as a black box (you're just interested in the result and the problem is easy or classic enough to be solved without tweaking).
But in the tags you mention genetic algorithms and graph algorithms. Maybe you should start by better defining your problem...
For graphs I really like Boost::Graph.
I have used lp_solve ( http://lpsolve.sourceforge.net/5.5/ ) on a couple of occasions with success. It is mature, feature-rich, and extremely well documented, with lots of good advice if your linear programming skills are rusty. Integer linear programming is not just an add-on but is strongly emphasized in this package.
Just noticed that you say you are a 'newbie' at this. Well, then I strongly recommend this package since the documentation is full of examples and gentle tutorials. Other packages I have tried tend to assume a lot of the user.
For large problems, you might look at AMPL, which is an optimization interpreter with many backend solvers available. It runs as a separate process; C++ would be used to write out the input data.
Then you could try various state-of-the-art solvers.
Look into GLPK. Comes with a few examples, and works with a subset of AMPL, although IMHO works best when you stick to C/C++ for model setup. Copes with pretty big models too.
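One low-commitment way to try several of the solvers mentioned in these answers is to have C++ build the model and write it out as text in the CPLEX LP file format, which GLPK's `glpsol` (with `--lp`) and CBC can read. The sketch below only produces the text; it does not call any solver API. The types `Term`, `Constraint`, `Model` and the function `toLpFormat` are hypothetical helpers, and only binary variables and a maximisation objective are handled.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct Term {
    double coeff;
    std::string var;
};

struct Constraint {
    std::string name;
    std::vector<Term> lhs;
    std::string sense;  // "<=", ">=" or "="
    double rhs;
};

struct Model {
    std::vector<Term> objective;          // maximised
    std::vector<Constraint> constraints;
    std::vector<std::string> binaries;    // 0/1 decision variables
};

// Write a linear expression such as " 5 x + 4 y" or " - 3 x + 2 y".
static void writeTerms(std::ostream& out, const std::vector<Term>& terms) {
    for (std::size_t i = 0; i < terms.size(); ++i) {
        const double c = terms[i].coeff;
        if (c < 0) out << " - ";
        else out << (i == 0 ? " " : " + ");
        out << (c < 0 ? -c : c) << " " << terms[i].var;
    }
}

// Serialise the model as CPLEX LP text.
std::string toLpFormat(const Model& m) {
    std::ostringstream out;
    out << "Maximize\n obj:";
    writeTerms(out, m.objective);
    out << "\nSubject To\n";
    for (const Constraint& c : m.constraints) {
        out << " " << c.name << ":";
        writeTerms(out, c.lhs);
        out << " " << c.sense << " " << c.rhs << "\n";
    }
    if (!m.binaries.empty()) {
        out << "Binary\n";
        for (const std::string& v : m.binaries) out << " " << v;
        out << "\n";
    }
    out << "End\n";
    return out.str();
}
```

For a resource mapping problem, each binary variable could mean "task i is placed on resource j", with one constraint per task and one capacity constraint per resource. Writing the result of `toLpFormat` to a file keeps the C++ side independent of any particular solver.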
Linear Programming from Wikipedia covers a few different algorithms that you could do some digging into to see which may work best for you. Does that help or were you wanting something more specific?