How fast is the simplex method for solving TSP? - linear-programming

How fast is the simplex method compared with brute force or any other algorithm for solving a TSP instance?

You can't model a TSP as a "pure" LP problem (with continuous variables only). You can use an integer-programming formulation, which will run the simplex method at each node of a search tree (branch-and-bound or branch-and-cut). That works for small problems, but it is slow because the problem is hard: with one binary variable per edge, for instance, you need a lot of constraints to model the fact that the tour is a single cycle.
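For a feel of why so many constraints are needed, here is a sketch of the classic Dantzig-Fulkerson-Johnson formulation for the symmetric TSP, with one binary variable per edge; delta(v) denotes the edges incident to v, and E(S) the edges with both endpoints in S. The subtour-elimination constraints in the third line are exponential in number, which is why branch-and-cut adds them lazily:

    \begin{aligned}
    \min \ & \sum_{e \in E} c_e x_e \\
    \text{s.t. } & \sum_{e \in \delta(v)} x_e = 2 && \forall v \in V \\
    & \sum_{e \in E(S)} x_e \le |S| - 1 && \forall S \subsetneq V,\ 3 \le |S| \le |V| - 1 \\
    & x_e \in \{0, 1\} && \forall e \in E
    \end{aligned}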
Brute force is intractable (the number of tours grows factorially with the number of cities), so do not even try it unless you have a very small problem. Use the MIP formulation, even for small problems.
For big problems, you should use some kind of heuristic (I think simulated annealing gives good results on this one), or a "smart" model of your problem (column generation, for instance) if you want an exact solution.

Related

GLPK Simplex behavior with meth=GLP_PRIMAL vs meth=GLP_DUALP

I have a linear program with no objective function, so I just want to test its feasibility. I am using the GLPK API's simplex to do that. When I run simplex with the default method (meth=GLP_PRIMAL), the solver fails to converge in 100000 iterations (that is the limit I have set). However, when I use the method GLP_DUALP, after a few iterations I get the message "Warning: dual degeneracy; switching to primal simplex" and it goes on to converge in a reasonable number of iterations.
So my question is: if it ultimately uses the primal simplex in both cases, why does it not converge in the first case? What might be going on?
Thanks in advance.
It's hard to say what exactly is happening without detailed information about the problem, but basically the primal simplex in the dual-degenerate case is, in a sense, "warm-started".
The dual simplex works on the dual problem: it starts from a basis that already satisfies the optimality conditions and then searches for a solution that is also feasible. The primal simplex works the other way around: it starts from a feasible solution and then searches for an optimal one.
By strong duality, both optimal values should coincide. In your problem you get a "dual degeneracy" warning, which means that in the dual problem some equation evaluates to 0, so the variable in that equation has no influence on the objective function (no matter whether it is 100 or just 1); this is plausible since you don't have an objective function. GLPK then switches to the primal simplex because the dual problem has alternative optimal solutions. With the information already derived, the primal simplex can be faster; I don't know exactly what GLPK is doing, but normally one can use the feasible solutions of the dual problem as a bound on the primal problem.
What stalls your primal approach is perhaps the same issue. The problem is degenerate, and the simplex algorithm gets stuck on variables that have no influence on the objective, therefore it is hard to find an optimal value for those variables.
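For reference, forcing the dual simplex with fallback is just a matter of setting meth in the control structure. A minimal sketch, assuming a problem loaded from a file (the name "model.mps" is a placeholder):

    // Sketch only: feasibility test with GLPK's simplex, using
    // meth = GLP_DUALP (dual simplex, automatic fallback to primal).
    #include <cstdio>
    #include <glpk.h>

    int main() {
        glp_prob *lp = glp_create_prob();
        glp_read_mps(lp, GLP_MPS_FILE, NULL, "model.mps");

        glp_smcp parm;
        glp_init_smcp(&parm);
        parm.meth   = GLP_DUALP;  // dual simplex; falls back to primal on failure
        parm.it_lim = 100000;     // iteration limit, as in the question

        int ret    = glp_simplex(lp, &parm);
        int status = glp_get_status(lp);
        if (ret == 0 && (status == GLP_OPT || status == GLP_FEAS))
            std::printf("feasible\n");
        else
            std::printf("not solved or infeasible (ret=%d, status=%d)\n", ret, status);

        glp_delete_prob(lp);
        return 0;
    }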

Fastest min-cut (max-flow) algorithm in actual practice on small weighted DAG

I want to solve the min-cut problem on a lot of small DAGs (8-12 nodes, 20-60 edges) very quickly. It looks like the best approach is to solve the max-flow problem and deduce a cut from that. There are quite a few max-flow algorithms with both theoretical and empirical timing comparisons available, but these all assume that what's interesting is performance as the graphs get larger and larger; it is also often mentioned that the set-up times for the complicated data structures involved can be quite big. So given a careful, optimized implementation (probably in C++), which algorithm turns out to be fastest to initialise and run on small graphs? (My naive assumption is that Edmonds-Karp is probably the simplest in terms of data structures and so will beat more complicated algorithms, but that's just a guesstimate.)
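That guess is plausible for graphs this small: on a plain adjacency matrix, Edmonds-Karp needs essentially no set-up at all. A sketch (not a benchmark), assuming capacities are pre-loaded into cap[][] and nodes are numbered 0..n-1:

    // Sketch: Edmonds-Karp max-flow on an adjacency-matrix residual graph.
    // For 8-12 nodes a dense matrix avoids any data-structure set-up cost.
    #include <algorithm>
    #include <climits>
    #include <cstring>
    #include <queue>

    const int MAXN = 12;
    int cap[MAXN][MAXN];   // residual capacities; fill with edge weights first

    int max_flow(int n, int s, int t) {
        int flow = 0;
        for (;;) {
            int parent[MAXN];
            std::memset(parent, -1, sizeof parent);
            parent[s] = s;
            std::queue<int> q;
            q.push(s);
            while (!q.empty() && parent[t] == -1) {   // BFS: shortest augmenting path
                int u = q.front(); q.pop();
                for (int v = 0; v < n; ++v)
                    if (parent[v] == -1 && cap[u][v] > 0) { parent[v] = u; q.push(v); }
            }
            if (parent[t] == -1) break;               // no augmenting path left: done
            int push = INT_MAX;
            for (int v = t; v != s; v = parent[v])    // bottleneck along the path
                push = std::min(push, cap[parent[v]][v]);
            for (int v = t; v != s; v = parent[v]) {  // augment and update residuals
                cap[parent[v]][v] -= push;
                cap[v][parent[v]] += push;
            }
            flow += push;
        }
        return flow;
    }

After the last BFS fails, the nodes still reachable from s in the residual network form the source side of a minimum cut; the cut edges are the original edges leaving that set.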

Is there any free ITERATIVE linear system solver in C++ that allows me to feed in an arbitrary initial guess?

I am looking for an iterative linear system solver to calculate a continuously changing field. For the simulation to work properly, I need to re-calculate the field (maybe several times) at every time step. Fortunately, I have a good initial guess for each time step, so it would be best if I could feed it into an iterative solver. The coefficient matrix is also very dense.
The problem is, I checked several iterative solvers online, like Gmm++, IML++, ITL, DUNE/ISTL and so on. They are either for sparse systems or don't provide interfaces for inputting initial guesses (I might be wrong, since I didn't have time to go through all the documentation).
So I have two questions:
1. Is there any such C++ solver available online?
2. Since the coefficient matrix can be as large as thousands x thousands, could a direct solver be quicker than an iterative solver with a really good initial guess?
Thanks!
If you check the header for Conjugate Gradient in IML++ (http://math.nist.gov/iml++/cg.h.txt), you'll see that you can very easily provide the initial guess for the solution in the very variable where you'd expect to get the solution.
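To illustrate the point (this is not IML++'s code, just a self-contained sketch of the same idea): in essentially any conjugate-gradient implementation, the solution vector doubles as the initial guess, because the first residual is computed as b - A*x0.

    // Sketch: dense conjugate gradient for SPD systems. x holds the
    // initial guess on entry and the solution on exit.
    #include <cmath>
    #include <vector>

    using Vec = std::vector<double>;
    using Mat = std::vector<Vec>;

    Vec matvec(const Mat& A, const Vec& x) {
        Vec y(x.size(), 0.0);
        for (size_t i = 0; i < A.size(); ++i)
            for (size_t j = 0; j < x.size(); ++j)
                y[i] += A[i][j] * x[j];
        return y;
    }

    double dot(const Vec& a, const Vec& b) {
        double s = 0.0;
        for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
        return s;
    }

    // Returns the number of iterations used.
    int cg(const Mat& A, Vec& x, const Vec& b, int max_iter, double tol) {
        Vec r = b, Ax = matvec(A, x);
        for (size_t i = 0; i < r.size(); ++i) r[i] -= Ax[i];  // r = b - A*x0
        Vec p = r;
        double rr = dot(r, r);
        for (int k = 0; k < max_iter; ++k) {
            if (std::sqrt(rr) < tol) return k;                // converged
            Vec Ap = matvec(A, p);
            double alpha = rr / dot(p, Ap);
            for (size_t i = 0; i < x.size(); ++i) {
                x[i] += alpha * p[i];                         // update solution
                r[i] -= alpha * Ap[i];                        // update residual
            }
            double rr_new = dot(r, r);
            double beta = rr_new / rr;
            rr = rr_new;
            for (size_t i = 0; i < p.size(); ++i)
                p[i] = r[i] + beta * p[i];                    // new search direction
        }
        return max_iter;
    }

The better the guess in x, the smaller the initial residual, and the fewer iterations CG needs; that is exactly the behaviour you want for a slowly changing field.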

Is there any Integer Linear Programming software that also returns non-optimal solutions?

I have an integer linear optimisation problem and I'm interested in feasible, good solutions. As far as I know, for example the GNU Linear Programming Kit only returns the optimal solution (given it exists).
This takes endless time and is not exactly what I'm looking for: I would be happy with any good solution, not only the optimal one.
So an LP solver that, e.g., stops after some time and returns the best solution it has found so far would do the job.
Is there any such software? It would be great if that software was open source or at least free as in beer.
Alternatively: Is there any other way that usually speeds up Integer LP problems?
Is this the right place to ask?
Many solvers provide a time limit parameter; if you set it, they will stop once the time limit is reached, and if an integer feasible solution has been found by then, they will return the best feasible solution found to that point.
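For example, with GLPK's MIP solver the relevant knobs look roughly like this (a sketch; "model.mps" is a placeholder file name):

    // Sketch only: stopping GLPK's MIP solver on a time limit and keeping
    // whatever integer-feasible solution it has found so far.
    #include <cstdio>
    #include <glpk.h>

    int main() {
        glp_prob *mip = glp_create_prob();
        glp_read_mps(mip, GLP_MPS_FILE, NULL, "model.mps");

        glp_iocp parm;
        glp_init_iocp(&parm);
        parm.presolve = GLP_ON;   // let glp_intopt solve the LP relaxation itself
        parm.tm_lim   = 60000;    // time limit in milliseconds
        parm.mip_gap  = 0.01;     // also accept a 1% relative gap (see below)

        glp_intopt(mip, &parm);
        int status = glp_mip_status(mip);       // GLP_FEAS = feasible but unproven
        if (status == GLP_OPT || status == GLP_FEAS)
            std::printf("best objective so far: %g\n", glp_mip_obj_val(mip));

        glp_delete_prob(mip);
        return 0;
    }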
As you may know, integer programming is NP-hard, and there is a real art to finding optimal solutions as well as good feasible solutions quickly. To compare the different solvers, see Prof. Hans Mittelmann's Benchmarks for Optimization Software. The MILP benchmarks - particularly MIPLIB2010 and the Feasibility Benchmark should be most relevant.
In addition to selecting a good solver, there are many things that can be done to improve solve times including tuning the parameters of the solver and model reformulation. Many people in research and industry - including myself - spend our careers working on improving the solve times of MIP models, both in general and for specific models.
If you are an academic user, note that the top commercial systems like CPLEX and Gurobi are free for academic use. See the respective websites for details.
Finally, you may want to look at OR-Exchange, a sister site to Stack Overflow that focuses on the field of operations research.
(Disclaimer: I currently work for Gurobi Optimization and formerly worked for ILOG, which provided CPLEX).
If you would like to get a feasible integer solution fast and you don't need the optimal solution, you can try the following (a code sketch follows this list):
Increase the relative or absolute gap. Solvers usually have small default gaps, say 0.0001% for the relative gap. This means that the solver will continue searching for MIP solutions until the incumbent MIP solution is no farther than 0.0001% away from the optimal solution. Increase this gap to e.g. 1%, so you get a good solution, but the solver will not spend a long time proving optimality.
Try different values for solver parameters concerning MIP heuristics.
CPLEX and Gurobi have parameters to control MIP emphasis. This means that the solver puts more effort either into looking for feasible solutions or into proving optimality. Set the emphasis to feasible MIP solutions.
Most solvers, like CPLEX, Gurobi, MOPS or GLPK, have settings for the gap and for heuristics. MIP emphasis can be set, as far as I know, only in CPLEX and Gurobi.
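For instance, in Gurobi's C++ API the three knobs above map to parameters roughly as follows (a hedged sketch; the model building is omitted and "model.lp" is a placeholder file):

    // Sketch: gap / heuristics / emphasis settings in Gurobi's C++ API.
    #include <iostream>
    #include "gurobi_c++.h"

    int main() {
        try {
            GRBEnv env;
            GRBModel model(env, "model.lp");

            model.set(GRB_DoubleParam_MIPGap, 0.01);     // stop at a 1% relative gap
            model.set(GRB_DoubleParam_Heuristics, 0.5);  // spend more time in MIP heuristics
            model.set(GRB_IntParam_MIPFocus, 1);         // emphasize finding feasible solutions
            model.set(GRB_DoubleParam_TimeLimit, 60.0);  // and/or a hard time limit (seconds)

            model.optimize();
            if (model.get(GRB_IntAttr_SolCount) > 0)     // any incumbent found?
                std::cout << "best objective: "
                          << model.get(GRB_DoubleAttr_ObjVal) << "\n";
        } catch (GRBException& e) {
            std::cerr << e.getMessage() << "\n";
        }
        return 0;
    }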
A usual approach for solving an ILP is branch and bound. This relies on solving many sub-LPs (the ILP without the integrality constraints). The final optimal result is the best of all the sub-LPs, so as soon as at least one integer-feasible solution has been found, you can stop at any time and keep the best so far.
One package that can do this is the free lp_solve. Look there at set_timeout for setting a time limit; for an ILP, the solve function can then return SUBOPTIMAL together with the best-so-far solution.
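A minimal sketch of that, assuming a model file named "model.lp" (hypothetical):

    // Sketch: lp_solve with a timeout; a SUBOPTIMAL return still carries
    // the best integer solution found so far.
    #include <cstdio>
    #include "lp_lib.h"

    int main() {
        lprec *lp = read_LP((char*)"model.lp", NORMAL, (char*)"ilp");
        if (!lp) return 1;

        set_timeout(lp, 60);      // give up after 60 seconds
        int ret = solve(lp);
        if (ret == OPTIMAL || ret == SUBOPTIMAL)
            std::printf("objective: %f\n", get_objective(lp));

        delete_lp(lp);
        return 0;
    }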
As far as I know, CPLEX can. It can return the solution pool, which contains the primal feasible solutions found during the search, and if you set the search focus on feasibility rather than optimality, more feasible solutions are generated. At the end you can just export the pool, or use it to warm-start another solve; it's up to you. CPLEX is free now, at least in my country, as you can sign up as a researcher.
Could you take Microsoft Solver Foundation into account? The only restriction is the technology stack: as you would guess, you should use Microsoft technologies (C#, VB.NET, etc.). Here is an example of how to use it with Excel: http://channel9.msdn.com/posts/Modeling-with-Solver-Foundation-30 .
Regarding your question: it is possible to get solutions that are not fully optimized if you set a tolerance (for example 85%, or 0.85). In the outcome you can then see all solutions that satisfy that restriction.

Least Squares Regression in C/C++

How would one go about implementing least squares regression for factor analysis in C/C++?
The gold standard for this is LAPACK. You want, in particular, xGELS.
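A minimal sketch of calling it through the LAPACKE C interface, assuming liblapacke is installed (DGELS solves min ||Ax - b||_2 via a QR factorization):

    // Sketch: overdetermined least squares with LAPACK's DGELS (LAPACKE).
    #include <cstdio>
    #include <lapacke.h>

    int main() {
        // Fit y = c0 + c1*x to four points: A is 4x2, b is 4x1.
        double A[4 * 2] = { 1, 0,
                            1, 1,
                            1, 2,
                            1, 3 };
        double b[4]     = { 1.0, 2.1, 2.9, 4.2 };

        lapack_int info = LAPACKE_dgels(LAPACK_ROW_MAJOR, 'N',
                                        4, 2, 1,   // m rows, n cols, one RHS
                                        A, 2,      // lda = n in row-major layout
                                        b, 1);     // ldb = nrhs in row-major layout
        if (info == 0)  // the solution overwrites the first n entries of b
            std::printf("intercept=%g slope=%g\n", b[0], b[1]);
        return 0;
    }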
When I've had to deal with large datasets and large parameter sets for non-linear parameter fitting I used a combination of RANSAC and Levenberg-Marquardt. I'm talking thousands of parameters with tens of thousands of data-points.
RANSAC is a robust algorithm for minimizing noise due to outliers by using a reduced data set. It's not strictly least squares, but can be applied to many fitting methods.
Levenberg-Marquardt is an efficient way to solve non-linear least-squares numerically.
The convergence rate in most cases is between that of steepest-descent and Newton's method, without requiring the calculation of second derivatives. I've found it to be faster than Conjugate gradient in the cases I've examined.
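For reference, each LM iteration solves a damped Gauss-Newton system; with Jacobian J of the residual vector r and damping parameter lambda:

    (J^T J + \lambda I)\,\delta = -J^T r

A large lambda makes the step behave like steepest descent, a small lambda like Gauss-Newton, which is where the "between steepest descent and Newton's method" behaviour comes from.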
The way I did this was to set up RANSAC as an outer loop around the LM method. This is very robust but slow. If you don't need the additional robustness, you can just use LM.
Get ROOT and use TGraph::Fit() (or TGraphErrors::Fit())?
Big, heavy piece of software to install just for the fitter, though. Works for me because I already have it installed.
Or use GSL.
If you want to implement an optimization algorithm yourself, Levenberg-Marquardt seems quite difficult to implement. If really fast convergence is not needed, take a look at the Nelder-Mead simplex optimization algorithm; it can be implemented from scratch in a few hours (a bare-bones sketch follows the link below).
http://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method
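To give a feel for the size of the job, here is a bare-bones sketch with the standard coefficients (reflection 1, expansion 2, contraction 0.5, shrink 0.5), simplified to use inside contraction only; the Rosenbrock test in main is just an illustration:

    #include <algorithm>
    #include <cstdio>
    #include <functional>
    #include <utility>
    #include <vector>

    using Point = std::vector<double>;

    Point nelder_mead(const std::function<double(const Point&)>& f,
                      Point start, double step, int max_iter) {
        const size_t n = start.size();
        std::vector<std::pair<double, Point>> s;      // (value, vertex)
        s.push_back({f(start), start});
        for (size_t i = 0; i < n; ++i) {              // initial simplex: start + offsets
            Point p = start; p[i] += step;
            s.push_back({f(p), p});
        }
        auto by_value = [](const std::pair<double, Point>& a,
                           const std::pair<double, Point>& b) { return a.first < b.first; };
        for (int it = 0; it < max_iter; ++it) {
            std::sort(s.begin(), s.end(), by_value);  // s[0] best, s[n] worst
            Point c(n, 0.0);                          // centroid of all but the worst
            for (size_t i = 0; i < n; ++i)
                for (size_t j = 0; j < n; ++j) c[j] += s[i].second[j] / n;
            auto probe = [&](double t) {              // point c + t*(c - worst)
                Point p(n);
                for (size_t j = 0; j < n; ++j)
                    p[j] = c[j] + t * (c[j] - s[n].second[j]);
                return std::make_pair(f(p), p);
            };
            auto r = probe(1.0);                      // reflection
            if (r.first < s[0].first) {
                auto e = probe(2.0);                  // expansion
                s[n] = (e.first < r.first) ? e : r;
            } else if (r.first < s[n - 1].first) {
                s[n] = r;
            } else {
                auto k = probe(-0.5);                 // inside contraction
                if (k.first < s[n].first) s[n] = k;
                else                                  // shrink toward the best vertex
                    for (size_t i = 1; i <= n; ++i) {
                        for (size_t j = 0; j < n; ++j)
                            s[i].second[j] = s[0].second[j]
                                + 0.5 * (s[i].second[j] - s[0].second[j]);
                        s[i].first = f(s[i].second);
                    }
            }
        }
        std::sort(s.begin(), s.end(), by_value);
        return s[0].second;
    }

    int main() {
        auto rosenbrock = [](const Point& p) {        // classic test function
            double a = 1 - p[0], b = p[1] - p[0] * p[0];
            return a * a + 100 * b * b;
        };
        Point x = nelder_mead(rosenbrock, {-1.2, 1.0}, 0.5, 2000);
        std::printf("minimum near (%f, %f)\n", x[0], x[1]);
        return 0;
    }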
Have a look at
http://www.alglib.net/optimization/
They have C++ implementations for L-BFGS and Levenberg-Marquardt.
You only need to work out the first derivative of your objective function to use these two algorithms.
I've used TNT/JAMA for linear least-squares estimation. It's not very sophisticated but is fairly quick + easy.
Let's talk first about factor analysis, since most of the discussion above is about regression. Most of my experience is with software like SAS, Minitab, or SPSS that solves the factor analysis equations, so I have limited experience in solving them directly. That said, the most common implementations do not use linear regression to solve the equations. According to this, the most common methods used are principal component analysis and principal factor analysis. In a text on applied multivariate analysis (Dallas Johnson), no less than seven methods are documented, each with its own pros and cons. I would strongly recommend finding an implementation that gives you factor scores rather than programming a solution from scratch.
The reason there are different methods is that you can choose exactly what you're trying to minimize. There's a pretty comprehensive discussion of the breadth of methods here.