Is there a known algorithm for finding a maximum when there is a constraint on the objective function itself? That is, I am interested in maximizing
c^T x
under the constraint
Ax <= b
while also requiring that
c^T x <= α
It looks like a standard problem for the simplex algorithm, except that I have this additional constraint on the cost being maximized.
The simplex algorithm can deal with any linear constraints and any linear objective function; there is nothing special about a linear constraint that involves the objective function. Any LP solver can do the trick. It might be handy, though, to add a new decision variable that is set equal to the objective.
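For concreteness, here is a sketch of that reformulation, introducing an auxiliary decision variable z that equals the objective (A, b, c and α are as in the question):

```latex
\begin{align*}
\max\;        & z \\
\text{s.t.}\; & Ax \le b \\
              & c^{T}x - z = 0 \\
              & z \le \alpha
\end{align*}
```

Of course, you could also simply pass c^T x <= α to the solver as one more ordinary row; the auxiliary variable is only a convenience when the objective appears in several constraints.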
I have the following LP:
max -x4 - x5 - x6 - x7
s.t. x0 + x1 + x4 = 1
     x2 + x3 + x5 = 1
     x0 + x2 + x6 = 1
     x1 + x3 + x7 = 1
Gurobi gives me the basis B = [1, 0, 2, 10]. My model has 8 variables and rank(A) = 4, yet the basis contains variable x10. My question is: why does Gurobi generate slack variables even though rank(A) = 4? And how can I get an optimal basis that contains only the original variables (x0 to x7)?
The problem is degenerate. There are multiple optimal bases that give the same primal solution. In other words, some variables are basic at bound. This happens a lot in practice and you should not worry about that.
To make things more complicated: there are also multiple optimal primal solutions; so we say that we have both primal and dual degeneracy.
My question is: why does Gurobi generate slack variables even though rank(A) = 4?
For solvers like Gurobi (i.e. not tableau solvers), the LP problem has n structural variables and m logical variables (a.k.a. slacks). The slack variables are implicit; they are not "generated" in the sense that the matrix A is physically augmented with an identity matrix. Again, this is not something to worry about.
And how can I get an optimal basis that contains only the original variables (x0 to x7)?
Well, this is an optimal basis. So why would Gurobi spend time and do more pivots to try to make all slacks non-basic? AFAIK no solver would do that. They treat structural and logical variables as equals.
It is not so easy to force variables to be in the basis. A free variable will most likely be in the (optimal) basis (but no 100% guarantee). You can also specify an advanced basis for Gurobi to start from. In the extreme: if this advanced basis is optimal (and feasible) Gurobi will not do any pivots.
I believe this particular problem has 83 optimal (and feasible) bases, and only one of them has all slacks nonbasic. I don't think it is easy to find that solution, even if you had access to a simplex code and could change it so that, after finding an optimal solution, it keeps pivoting slacks out of the basis. I think you would need to enumerate the optimal bases (explicitly or implicitly).
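As an illustration of the basis-status bookkeeping discussed above, here is a minimal C++ sketch that builds the LP from the question and reads back which structural variables and slacks ended up basic. The VBasis/CBasis attribute names are taken from my recollection of the Gurobi C++ API, so double-check them against the reference manual.

```cpp
#include "gurobi_c++.h"
#include <iostream>
#include <string>

int main() {
    GRBEnv env;
    GRBModel model(env);

    // Structural variables x0..x7 from the question (continuous, >= 0).
    GRBVar x[8];
    for (int i = 0; i < 8; ++i)
        x[i] = model.addVar(0.0, GRB_INFINITY, 0.0, GRB_CONTINUOUS,
                            "x" + std::to_string(i));

    // max -x4-x5-x6-x7 is equivalent to min x4+x5+x6+x7.
    model.setObjective(x[4] + x[5] + x[6] + x[7], GRB_MINIMIZE);

    model.addConstr(x[0] + x[1] + x[4] == 1);
    model.addConstr(x[2] + x[3] + x[5] == 1);
    model.addConstr(x[0] + x[2] + x[6] == 1);
    model.addConstr(x[1] + x[3] + x[7] == 1);

    model.optimize();

    // VBasis == 0 marks a basic structural variable,
    // CBasis == 0 marks a basic slack of the corresponding row.
    for (int i = 0; i < 8; ++i)
        std::cout << x[i].get(GRB_StringAttr_VarName)
                  << "  VBasis = " << x[i].get(GRB_IntAttr_VBasis) << "\n";

    GRBConstr* rows = model.getConstrs();
    for (int i = 0; i < model.get(GRB_IntAttr_NumConstrs); ++i)
        std::cout << "row " << i
                  << "  CBasis = " << rows[i].get(GRB_IntAttr_CBasis) << "\n";
    delete[] rows;
    return 0;
}
```

To supply an advanced starting basis instead, set the same VBasis/CBasis attributes before calling optimize().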
Slack variables are generated because they are needed to solve the dual linear programming model and the reduction bound; a complementary slackness variable Si would also be generated.
Moreover, x0 forms a branch for linear independence of the dual set-up. x0 has the value 4; slack variables are generated from the transformation of the basis, where the rank is given, to the branch x0 for linear independence.
A reduction matrix is formed to find the value of x10, which is 5/2.
This helps to eliminate x10 in order to get an optimal basis from the reduction matrix.
Consider the common linear programming problem below:
min c'x
s.t. Ax <= b
(A is m × n, with m smaller than n)
As I understand it, the pivoting procedure in the simplex method jumps from one extreme point to another until it finds the optimal solution.
An extreme point has at most m (the number of constraints) nonzero variables. The variables at an extreme point can be divided into two groups: basic variables (the nonzero terms) and nonbasic variables (the zero terms).
Normally, each pivoting iteration turns one nonbasic variable into a basic variable while one basic variable becomes nonbasic.
My question is: can a variable that was basic, and then became nonbasic, become basic again later? If yes, is there a clear or simple example in which at least one variable does?
The official documentation says that TermCriteria(int Type, int MaxCount, double epsilon) defines the termination criteria for iterative algorithms, and that the criteria type can be one of COUNT, EPS, or COUNT + EPS.
However, I can't quite understand what the SVM does differently in each iteration when I use svm->setTermCriteria(const cv::TermCriteria & val).
SVM training can be viewed as a large-scale quadratic programming (QP) problem, and unfortunately it cannot easily be solved with standard QP techniques. That is why numerous decomposition algorithms have been introduced in the literature over the past several years.
The basic idea of these algorithms is to decompose the large QP problem into a series of smaller QP sub-problems. That is, in each iteration the algorithm keeps most dimensions of the optimization variable vector fixed and varies only a small subset of dimensions (the working set) to obtain the maximal reduction of the objective function.
As far as I know, OpenCV uses the SVMlight / generalized SMO algorithm, and the TermCriteria parameter is the termination criterion for this iterative SVM training procedure, which solves the constrained QP problem described above. You can specify a tolerance and/or a maximum number of iterations.
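For what it's worth, a minimal sketch of setting the criteria with the OpenCV 3.x ml API; the data preparation is assumed to have happened elsewhere, and the particular numbers (1000 iterations, 1e-6 tolerance) are arbitrary examples:

```cpp
#include <opencv2/ml.hpp>

// Hypothetical helper: samples is a CV_32F matrix with one sample per row,
// labels is a CV_32S column vector of class labels.
cv::Ptr<cv::ml::SVM> trainSvm(const cv::Mat& samples, const cv::Mat& labels) {
    cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
    svm->setType(cv::ml::SVM::C_SVC);
    svm->setKernel(cv::ml::SVM::RBF);

    // Stop the iterative QP decomposition either after 1000 iterations
    // or once the tolerance 1e-6 is reached, whichever comes first.
    svm->setTermCriteria(cv::TermCriteria(
        cv::TermCriteria::COUNT + cv::TermCriteria::EPS, 1000, 1e-6));

    svm->train(cv::ml::TrainData::create(samples, cv::ml::ROW_SAMPLE, labels));
    return svm;
}
```

With COUNT + EPS, whichever of the two limits is hit first terminates training.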
Suppose I have data (a daily stock chart is a good example, but it could be anything) in which I only know the range (high - low) within which X units sold, but not the exact price at which any given item sold. Assume for simplicity that the price range contains enough buckets (e.g. forty one-cent increments for a 40-cent range) to make such a distribution practical. How can I go about distributing those items to form a normal bell curve stored in a vector? It doesn't have to be perfect, just realistic.
My (very) naive thinking has been that, since random numbers should form a normal distribution, I could do something like use a binary RNG: if, for example, there are forty buckets, then when '0' comes up 40 times in a row the 0th bucket gets incremented, when '1' comes up 40 times in a row the 39th bucket gets incremented, and when '1' comes up 20 times the item lands in the middle of the vector. Do this for each item until X units have been accounted for. This may or may not be right, and in any case it seems far less efficient than necessary. I am looking for something more sensible.
This isn't homework, just a problem that has been bugging me, and my statistics isn't up to snuff. Most of the literature seems to be about analyzing a distribution after it already exists, not about how to artificially create one.
I want to write this in C++, so pre-packaged solutions in R or MATLAB are not very useful to me.
Thanks. I hope this made sense.
Most of the literature seems to be about analyzing a distribution after it already exists, not about how to artificially create one.
There is tons of literature on how to create one. The Box–Muller transform, the Marsaglia polar method (a variant of Box–Muller), and the ziggurat algorithm are three; Google those terms. Both Box–Muller methods are easy to implement.
Better yet, just use an existing random number generator that implements one of these algorithms. Both Boost and the C++11 standard library provide one.
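For example, a minimal C++11 sketch of filling the question's price buckets with std::normal_distribution; the bucket count, unit count, and the choice of standard deviation are arbitrary assumptions:

```cpp
#include <iostream>
#include <random>
#include <vector>

int main() {
    // Hypothetical numbers: a 40-cent range split into one-cent buckets,
    // with 10000 units to distribute.
    const int num_buckets = 40;
    const int num_units   = 10000;

    std::vector<int> buckets(num_buckets, 0);

    std::mt19937 gen(std::random_device{}());
    // Center the bell curve in the middle of the range; the standard
    // deviation (range / 6) is an arbitrary choice that keeps almost
    // all samples inside the range.
    std::normal_distribution<double> dist(num_buckets / 2.0, num_buckets / 6.0);

    for (int i = 0; i < num_units; ++i) {
        int b = static_cast<int>(dist(gen));
        if (b < 0) b = 0;                         // clamp tail samples
        if (b >= num_buckets) b = num_buckets - 1;
        ++buckets[b];
    }

    for (int b = 0; b < num_buckets; ++b)
        std::cout << b << ": " << buckets[b] << '\n';
}
```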
The algorithm that you describe relies on the Central Limit Theorem, which says that a random variable defined as the sum of n random variables drawn from the same distribution approaches a normal distribution as n grows to infinity. Uniformly distributed pseudorandom variables produced by a computer PRNG are a special case of this general theorem.
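As an aside, the classic textbook illustration of this idea sums twelve uniforms (mean 6, variance 1); a quick C++ sketch, for illustration only:

```cpp
#include <random>

// Sum of 12 independent U(0,1) variables has mean 6 and variance 1,
// so (sum - 6) is an approximate standard normal draw. This illustrates
// the Central Limit Theorem idea above; it is not a production method.
double approx_standard_normal(std::mt19937& gen) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double sum = 0.0;
    for (int i = 0; i < 12; ++i)
        sum += u(gen);
    return sum - 6.0;
}
```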
To get a more efficient algorithm, you can view the probability density function as a sort of space warp that expands the real axis in the middle and shrinks it towards the ends.
Let F: R -> [0,1] be the cumulative distribution function of the normal distribution, let invF be its inverse, and let x be a random variable uniformly distributed on [0,1]; then invF(x) is a normally distributed random variable.
All you need in order to implement this is the ability to compute invF(x). Unfortunately this function cannot be expressed in terms of elementary functions; in fact, it is the solution of a nonlinear differential equation. However, you can efficiently solve the equation x = F(y) using Newton's method.
What I have described is a simplified presentation of the inverse transform method. It is a very general approach; there are specialized algorithms for sampling from the normal distribution that are more efficient, and these are mentioned in the answer of David Hammen.
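A rough C++ sketch of the inverse transform approach described above, using std::erf for F and Newton's method for invF (the iteration cap and tolerance are arbitrary choices):

```cpp
#include <cmath>
#include <iostream>
#include <random>

// Standard normal CDF expressed through the error function:
// F(y) = (1 + erf(y / sqrt(2))) / 2
double normal_cdf(double y) {
    return 0.5 * (1.0 + std::erf(y / std::sqrt(2.0)));
}

// Standard normal PDF, i.e. F'(y).
double normal_pdf(double y) {
    const double two_pi = 6.283185307179586;
    return std::exp(-0.5 * y * y) / std::sqrt(two_pi);
}

// Solve F(y) = x for y by Newton's method, starting from the mean.
double inverse_normal_cdf(double x) {
    double y = 0.0;
    for (int i = 0; i < 100; ++i) {
        double err = normal_cdf(y) - x;
        if (std::abs(err) < 1e-12) break;
        y -= err / normal_pdf(y);   // Newton step: y -= (F(y) - x) / F'(y)
    }
    return y;
}

int main() {
    std::mt19937 gen(42);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    // Inverse transform sampling: push uniforms through invF.
    for (int i = 0; i < 5; ++i)
        std::cout << inverse_normal_cdf(uniform(gen)) << '\n';
}
```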
I am starting to work with the Boost Graph Library. I need a best-first search, which I could implement using astar_search with zero costs. (Please correct me if I'm wrong.)
However, I wonder whether there is another way of doing this? If costs are not considered, the algorithm should be slightly more efficient.
EDIT: Sorry for the unclear description. I am actually implementing a potential field search, so I don't have any costs/weights associated with the edges but rather need to do a steepest-descent search (one that can overcome local minima).
Thanks for any hints!
You could definitely use A* to tackle this; you'd need h(x) to be 0 though, not g(x). A* rates nodes based on F, which is defined by
F(n) = g(n) + h(n)
(total cost = path cost + heuristic), where
g(n) = path cost, the distance from the initial state to the current state
h(n) = heuristic, the estimated cost from the current state to the goal state
From Wikipedia:
Dijkstra's algorithm, as another example of a best-first search algorithm, can be viewed as a special case of A* where h(x) = 0 for all x.
If you are comfortable with C++, I would suggest trying out YAGSBPL.
As suggested by Aphex's answer, you might want to use Dijkstra's algorithm; one way to set the edge weights is to set w(u, v) to potential(v) - potential(u), assuming that difference is nonnegative. Dijkstra's algorithm requires nonnegative edge weights, so distances increase as you move away from the source node. If you are searching for the smallest potential, flip the sides of the subtraction; if your potentials go both up and down, you may need something like Bellman-Ford, which is not best-first.
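A minimal sketch of that idea with the BGL's dijkstra_shortest_paths; the graph, the potential values, and the choice of source vertex are made up purely for illustration:

```cpp
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/dijkstra_shortest_paths.hpp>
#include <boost/property_map/property_map.hpp>
#include <iostream>
#include <vector>

int main() {
    using Graph = boost::adjacency_list<
        boost::vecS, boost::vecS, boost::directedS,
        boost::no_property,
        boost::property<boost::edge_weight_t, double>>;

    // Made-up potential values at four nodes of the field.
    std::vector<double> potential = {5.0, 3.0, 4.0, 1.0};

    Graph g(potential.size());

    // Edge weight = potential(v) - potential(u); only edges along which the
    // potential does not decrease are added, matching the assumption above.
    auto add = [&](int u, int v) {
        double w = potential[v] - potential[u];
        if (w >= 0.0)
            boost::add_edge(u, v, w, g);
    };
    add(3, 1); add(3, 2); add(1, 0); add(1, 2); add(2, 0);  // made-up adjacency

    std::vector<double> dist(boost::num_vertices(g));
    boost::dijkstra_shortest_paths(
        g, boost::vertex(3, g),
        boost::distance_map(boost::make_iterator_property_map(
            dist.begin(), boost::get(boost::vertex_index, g))));

    for (std::size_t i = 0; i < dist.size(); ++i)
        std::cout << "node " << i << ": distance " << dist[i] << '\n';
}
```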