Maximizing a Ratio/Percent - linear-programming

I'm using cvxpy to model a problem.
Inside a very large and complex LP, I create two continuous, affine (unconstrained) expressions: x and y.
Due to how they are created, I know that 0 < x < y <= U. Obviously: x/y < 1.
In my LP objective, how do I maximize the ratio x/y?
Things I tried:
Maximizing x*cp.inv_pos(y) fails because the problem is non-DCP (the same happens if I try to minimize the inverse).
I found various LP formulations for maximizing ratios (e.g. here or here), but these require rewriting the constraints on all the terms in the expressions for x - I have no idea how to do that with cvxpy.
If this is the way to go then an example would be very helpful!
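One standard way to go is the Charnes-Cooper transformation. A sketch, under the assumption (my notation, not from the question) that x = c'v + c0 and y = d'v + d0 are affine in the decision variables v, and the feasible set is { v : Av <= b }. Substitute t = 1/y and w = t*v; then maximizing x/y becomes the ordinary LP

maximize    c'w + c0*t
subject to  d'w + d0*t = 1
            A*w <= b*t
            t >= 0

Since y > 0 on the feasible set, t > 0 at an optimum and you recover v = w/t. The catch, as you noticed, is that every constraint involving v has to be restated in terms of w and t, so in cvxpy you would build the model with fresh variables w and t rather than reuse the existing expressions.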


Histogram Binning of Gradient Vectors

I am working on a project that has a small component requiring the comparison of distributions over image gradients. Assume I have computed the image gradients in the x and y directions using a Sobel filter, so I have a 2-vector for each pixel. Getting the magnitude and direction is reasonably trivial: magnitude = sqrt(gx*gx + gy*gy) and direction = atan2(gy, gx).
However, what is not clear to me is how to bin these two components into a two-dimensional histogram for an arbitrary number of bins.
I had considered something along these lines (written in the browser):
//Assuming normalised magnitudes.
//Histogram dimensions are bins * bins.
int getHistIdx(float mag, float dir, int bins) {
const int magInt = reinterpret_cast<int>(mag);
const int dirInt = reinterpret_cast<int>(dir);
const int magMod = reinterpret_cast<int>(static_cast<float>(1.0));
const int dirMod = reinterpret_cast<int>(static_cast<float>(TWO_PI));
const int idxMag = (magInt % magMod) % bins;
const int idxDir = (dirInt % dirMod) % bins;
return idxMag * bins + idxDir;
}
However, I suspect that the mod operation will introduce a lot of incorrect overlap, i.e. completely different gradients getting placed into the same bin.
Any insight into this problem would be very much appreciated.
I would like to avoid using any off the shelf libraries as I want to keep this project as dependency light as possible. Also I intend to implement this in CUDA.
This is more of a "what is a histogram?" question than one about your tags. Two things:
In the 2D plane, two directions that are equal modulo 2*pi are in fact the same, so it makes sense to wrap the direction.
I see no practical or logical reason for wrapping the norms.
Next, you say you want a "two dimensional histogram" but return a single number. A 2D histogram, which is what would make sense in your context, is a 3D plot: the plane is theta/R, indexed by two numbers, while the third axis is the count.
So, first suggestion:
return std::make_pair(idxMag, idxDir);
Then you can make one 2D histogram, or two separate 1D histograms.
Regarding the "number of bins"
This is use-case dependent. You need to decide how many bins you want (maybe a different count for theta and R). Maybe just a constant 10 bins? Maybe it should depend on the number of vectors? In any case, you need a function that receives either the number of vectors or the total set of vectors, and returns the number of bins for each axis. This could be a constant (10 bins) initially, and you can play with it. Once you decide on the number of bins:
Determine the bins
For a bounded case such as 0 < theta < 2*pi, this is easy: divide the interval equally into the number of bins, assuming a flat distribution. Your modulation would actually handle this well, if you had actually modulated by 2*pi, which you didn't. You would still need to determine the bin bounds, though.
For R this gets trickier, as it is unbounded. There are two options here, but both rely on the same tactic: choose a maximal bin. Either choose it arbitrarily (say R=10), so that any vector longer than that is placed in the "longer than max" bin, and divide the rest equally (for example; you could choose other distributions). Or let the longest vector determine the edge of the maximal bin.
Getting the index
Once you have the bins, you need to search for the magnitude/direction of the current vector in your bins. If bins are pairs representing the min/max of a bin (and maybe an index), say in a linked list, then it would be something like this (for the magnitude, for example):
bin = histogram.first;
while (mag > bin.max) bin = bin.next; // advance until mag falls inside this bin
magIdx = bin.index;
If the bin does not hold the index, you can just use a counter and increase it in the while loop. Also, for the magnitude, the final bin should hold "infinity" or some large number as its upper limit. Note this has nothing to do with modulation, though that would work for your direction, as you have coded it. I don't see how it makes sense for the norm.
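Putting both steps together, a minimal sketch in C++ (the equal-width direction bins on [0, 2*pi) and the fixed cap MAG_MAX are my assumptions - tune them for your data):

#include <algorithm>
#include <cmath>

const float TWO_PI  = 6.2831853f;
const float MAG_MAX = 10.0f; // vectors longer than this land in the last bin

void binGradient(float mag, float dir, int bins, int& magIdx, int& dirIdx) {
    // wrap the direction into [0, 2*pi), then split that interval equally
    dir = std::fmod(dir, TWO_PI);
    if (dir < 0.0f) dir += TWO_PI;
    dirIdx = std::min(static_cast<int>(dir / TWO_PI * bins), bins - 1);

    // clamp the magnitude: the final bin acts as the "longer than max" bin
    magIdx = std::min(static_cast<int>(mag / MAG_MAX * bins), bins - 1);
}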
Bottom line though, you have to think a bit about what you want. In any case all the "objects" here are trivial enough to write yourself, or even use small arrays.
I think you should arrange your bins in a square array, and then bin by vx and vy independently.
If your gradients are reasonably even you just need to scan the data first to accumulate the min and max in x and y, and then split the gradients evenly.
If the gradients are very unevenly distributed, you might want to sort the (e.g.) vx values first and arrange the boundaries so that each bin contains an equal share of the values.
An intermediate solution might be to obtain the min and max while ignoring the (e.g.) 10% most extreme values.
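A minimal sketch of that component-wise idea, assuming non-empty input and one extra pass over the data (names are mine):

#include <algorithm>
#include <vector>

// Equal-width bins per component; v == vMax maps to the last bin.
int binOf(float v, float vMin, float vMax, int bins) {
    if (vMax <= vMin) return 0; // degenerate range
    int idx = static_cast<int>((v - vMin) / (vMax - vMin) * bins);
    return std::min(idx, bins - 1);
}

void histogram2D(const std::vector<float>& vx, const std::vector<float>& vy,
                 int bins, std::vector<int>& counts) {
    // one scan to find the range of each component (input assumed non-empty)
    auto x = std::minmax_element(vx.begin(), vx.end());
    auto y = std::minmax_element(vy.begin(), vy.end());
    counts.assign(bins * bins, 0);
    for (std::size_t i = 0; i < vx.size(); ++i)
        ++counts[binOf(vx[i], *x.first, *x.second, bins) * bins +
                 binOf(vy[i], *y.first, *y.second, bins)];
}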

Backpropagation 2-Dimensional Neural Network C++

I am learning about two-dimensional neural networks, so I am facing many obstacles, but I believe it is worth it and I am really enjoying the learning process.
Here's my plan: to make a 2-D NN recognize images of digits. Images are 5-by-3 grids and I prepared 10 images, from zero to nine. For example, this would be number 7:

1 1 1
0 0 1
0 0 1
0 0 1
0 0 1

Number 7 has indexes 0,1,2,5,8,11,14 as 1s (or, equivalently, 3,4,6,7,9,10,12,13 as 0s - it doesn't matter), and so on. Therefore, my input layer will be a 5-by-3 neuron layer and I will be feeding it zeros OR ones only (not values in between; the indexes depend on which image I am feeding the layer).
My output layer, however, will be a one-dimensional layer of 10 neurons. Depending on which digit was recognized, a certain neuron should fire a value of one and the rest should be zeros (they shouldn't fire).
I am done implementing everything, but I have a problem with the computation and I would really appreciate any help. I am getting an extremely high error rate and extremely low (negative) output values on all output neurons, and the values (error and output) do not change even on the 10,000th pass.
I would love to go further and post my backpropagation methods, since I believe the problem is in them. However, to break my work down, I would love to hear some comments first; I want to know if my design is workable.
Does my plan make sense?
All the posts speak about ranges (0 to 1, -1 to +1, 0.01 to 0.5, etc.). Will it work with hard {0, 1} outputs on the output layer rather than a range? If so, how can I control that?
I am using the hyperbolic tangent as my transfer function. Does it make a difference compared to the sigmoid or other functions?
Any ideas/comments/guidance are appreciated. Thanks in advance.
Well, from the description given above, I think that the design and approach taken are correct! With respect to the choice of the activation function: remember that those functions help pick out the neurons with the largest activation, and their algebraic properties, such as an easy derivative, help with the definition of backpropagation. Taking this into account, you should not worry about your choice of activation function.
The ranges that you mention correspond to a scaling of the input; it is better to have your input images in the range 0 to 1. This scales the error surface and helps with the speed and convergence of the optimization process. Because your input set is composed of images, and each image is composed of pixels, the minimum and maximum values that a pixel can attain are 0 and 255, respectively. To scale your input in this example, simply divide each value by 255.
Now, with respect to the training problems: have you tried checking whether your gradient calculation routine is correct, for example by numerically evaluating the cost function J? If not, try generating a toy vector theta that contains all the weight matrices involved in your neural network, and evaluate the gradient at each point using the definition of the gradient. Sorry for the Matlab example, but it should be easy to port to C++:
perturb = zeros(size(theta));
numgrad = zeros(size(theta)); % preallocate the result
e = 1e-4;
for p = 1:numel(theta)
    % Set perturbation vector
    perturb(p) = e;
    loss1 = J(theta - perturb);
    loss2 = J(theta + perturb);
    % Compute numerical gradient (central difference)
    numgrad(p) = (loss2 - loss1) / (2*e);
    perturb(p) = 0;
end
After evaluating the numerical gradient, compare it with the gradient calculated by backpropagation. If the difference between the two calculations is less than 3e-9, then your implementation is very likely correct.
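For C++, a rough port of the Matlab sketch above might look like this; the std::function signature for J is an assumption - adapt it to however your network stores its weights:

#include <cstddef>
#include <functional>
#include <vector>

// Numerical gradient of a cost function J at a flattened weight vector theta.
std::vector<double> numericalGradient(
        const std::function<double(const std::vector<double>&)>& J,
        std::vector<double> theta) {
    const double e = 1e-4;
    std::vector<double> numgrad(theta.size(), 0.0);
    for (std::size_t p = 0; p < theta.size(); ++p) {
        const double saved = theta[p];
        theta[p] = saved - e;
        const double loss1 = J(theta);
        theta[p] = saved + e;
        const double loss2 = J(theta);
        theta[p] = saved;                       // restore the weight
        numgrad[p] = (loss2 - loss1) / (2 * e); // central difference
    }
    return numgrad;
}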
I recommend checking out the UFLDL tutorials offered by the Stanford Artificial Intelligence Laboratory; there you can find a lot of information about neural networks and their paradigms. It's worth a look!
http://ufldl.stanford.edu/wiki/index.php/Main_Page
http://ufldl.stanford.edu/tutorial/

How to prevent Gurobi from converting a Max into a Min prob?

I am using Gurobi (via C++) as part of my MSc thesis to solve Quadratic Knapsack Problem instances. So far, I was able to generate a model with binary decision variables, a quadratic objective function and the capacity constraint, and Gurobi solved it just fine. Then I wanted to solve the continuous relaxation of the QKP. I built the model as before, but with continuous variables instead of binary ones, and Gurobi threw an exception when I tried to optimize it:
10020 - Objective Q not PSD (negative diagonal entry)
This confounded me for a bit, since all values from the problem instance are ≥ 0. In preparing to post this question, I wrote both models out to a file and discovered the reason:
NAME (null)
* Max problem is converted into Min one
Which of course means that all previously positive values are now negative. Now I know why Q is not PSD, but how do I fix this? Can I prevent the conversion from a Max problem into a Min one? Do I need to configure the model for the continuous relaxation differently?
From my (inexperienced) point of view, it just looks like Gurobi shot itself in the foot.
When you are maximizing a quadratic objective with Gurobi or any other convex optimizer, your Q matrix has to be negative semi-definite, and when you are minimizing, your Q matrix needs to be positive semi-definite. Changing the sign and the sense of the objective changes nothing.
Gurobi doesn't verify that your problem is convex, but it will report any non-convexity it finds. The fact that your original problem seemed to solve as a MIP is an accident and you shouldn't rely on it.
You should model a quadratic objective with binary variables as a linear problem with some simple transformations. If x and y are binary, the product x*y can be replaced by a new variable z if you add the constraints
z <= x
z <= y
z >= x + y - 1
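For illustration, a minimal sketch of that linearization with Gurobi's C++ API (variable names and the objective coefficient are made up):

#include "gurobi_c++.h"

int main() {
    GRBEnv env;
    GRBModel model(env);
    GRBVar x = model.addVar(0.0, 1.0, 0.0, GRB_BINARY, "x");
    GRBVar y = model.addVar(0.0, 1.0, 0.0, GRB_BINARY, "y");
    // z stands in for the product x*y; continuous [0, 1] is enough here
    GRBVar z = model.addVar(0.0, 1.0, 0.0, GRB_CONTINUOUS, "z");
    model.addConstr(z <= x, "z_le_x");
    model.addConstr(z <= y, "z_le_y");
    model.addConstr(z >= x + y - 1, "z_ge_x_plus_y_minus_1");
    // the objective coefficient of the product x*y goes on z instead
    model.setObjective(3 * z, GRB_MAXIMIZE);
    model.optimize();
    return 0;
}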

OpenCV, C++: Distance between two points

For a group project, we are attempting to make a game where functions are executed whenever a player forms a set of specific hand gestures in front of a camera. To process the images, we are using OpenCV 2.3.
During the image-processing we are trying to find the length between two points.
We already know this can be done very easily with the Pythagorean theorem, though we have been told it requires a fair amount of computation, and we wish to do this with as few resources as possible.
We would like to know whether there exists any built-in function within OpenCV or the C++ standard library that can compute the distance between two points cheaply.
We have the coordinates of the points, which are in pixel values (of course).
Extra info:
Previous experience has taught us that OpenCV and other libraries are heavily optimized. As an example, we attempted to change the RGB values of the live image feed from the camera with a for loop going through each pixel. This gave a low frame-rate output. When we used a built-in OpenCV function instead, we got a high frame-rate output.
You should try this
cv::Point a(1, 3);
cv::Point b(5, 6);
double res = cv::norm(a - b); // Euclidean distance
As you correctly pointed out, there's an OpenCV function that does some of your work :)
(Also check the other answers.)
It is called magnitude() and it calculates the distance for you. And if you have a vector of more than 4 points to calculate distances for, it will use SSE (I think) to make it faster.
Now, the catch is that it only computes sqrt(x^2 + y^2) for you; you have to compute the differences yourself (check the documentation). But if you do those with OpenCV functions as well, it should be fast.
Mat pts1(nPts, 1, CV_32FC2), pts2(nPts, 1, CV_32FC2); // magnitude() needs floating-point input
// populate them
Mat diffPts = pts1-pts2;
Mat ptsx, ptsy;
// split your points in x and y vectors. maybe separate them from start
Mat dist;
magnitude(ptsx, ptsy, dist); // voila!
The other way is to use a very fast sqrt:
// 15 times faster than the classical float sqrt.
// Reasonably accurate up to root(32500)
// Source: http://supp.iar.com/FilesPublic/SUPPORT/000419/AN-G-002.pdf
unsigned int root(unsigned int x){
    unsigned int a, b;
    b = x;
    a = x = 0x3f;
    x = b / x;
    a = x = (x + a) >> 1;
    x = b / x;
    a = x = (x + a) >> 1;
    x = b / x;
    x = (x + a) >> 1;
    return x;
}
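Combined with hand-computed differences, it might be used like this (a sketch; integer pixel coordinates assumed):

// Approximate pixel distance using root() above.
// Only reasonably accurate while dx*dx + dy*dy stays below ~32500,
// i.e. distances under ~180 pixels.
unsigned int approxDistance(int x1, int y1, int x2, int y2) {
    const int dx = x1 - x2;
    const int dy = y1 - y2;
    return root(static_cast<unsigned int>(dx * dx + dy * dy));
}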
This ought to be a comment, but I haven't enough rep (50?) |-( so I post it as an answer.
What the guys are trying to tell you in the comments of your question is that if it's only about comparing distances, then you can simply use
d^2 = dx*dx + dy*dy = (x1-x2)*(x1-x2) + (y1-y2)*(y1-y2)
thus avoiding the square root. You can't skip the squaring, of course.
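For instance, a threshold test needs no root at all (a sketch, assuming integer pixel coordinates):

// true if the two pixels are closer than maxDist, with no sqrt involved
bool closerThan(int x1, int y1, int x2, int y2, int maxDist) {
    const int dx = x1 - x2;
    const int dy = y1 - y2;
    return dx * dx + dy * dy < maxDist * maxDist; // compares d^2 to maxDist^2
}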
Pythagoras is the fastest way, and it really isn't as expensive as you think. It used to be, because of the square-root. But modern processors can usually do this within a few cycles.
If you really need speed, use OpenCL on the graphics card for image processing.

Count of distinct acyclic paths from A[a,b] to A[c,d]?

I'm writing a Sokoban solver for fun and practice. It uses a simple algorithm (something like BFS with a bit of a difference).
Now I want to estimate its running time (big O and big Omega), and for that I need to know how to count the acyclic paths from one vertex to another in a network.
Actually, I want an expression that counts the valid paths between two vertices of an m*n matrix of vertices.
A valid path:
visits each vertex zero or one times;
has no circuits.
For example, this is a valid path:
(image: http://megapic.ir/images/f1hgyp5yxcu8887kfvkr.png)
but this is not:
(image: http://megapic.ir/images/wnnif13ir5gaqwvnwk9d.png)
What is needed is a method to count all acyclic paths between two given vertices a and b.
Comments on solution methods and tricks are welcome.
Not a solution, but maybe you can take this idea a bit further. The problem is that you would also need to calculate the longest possible path in order to get all paths. The longest-path problem is NP-complete for general graphs, so the computation will take a very long time even for relatively small graphs (8x8 and greater).
Imagine the start vertex is in the top-left corner and the end vertex is in the lower-right corner of the matrix.
For a 1x2 matrix there is only 1 possible path.
2x2 matrix => 2*1 paths => 2
3x2 matrix => 2*2 paths => 4
3x3 matrix => 2*4 + 2*2 paths => 12
3x4 matrix => 2*12 + 12 + 2 paths => 38
Each time I combined the results from the previous calculations to get the current number of paths. It could be that there is a closed formula for such a planar graph; maybe there is even a lot of theory for that, but I am too stupid for that ...
You can use the following Java snippet (sorry, I am not a C++ expert :-/) to calculate possible paths for larger matrices:
public class Main {

    int xSize;
    int ySize;
    boolean[][] visited;

    public static void main(String[] args) {
        new Main(3, 2).start();
    }

    public Main(int maxX, int maxY) {
        xSize = maxX;
        ySize = maxY;
        visited = new boolean[xSize][ySize];
    }

    public void start() {
        // path starts in the top left corner
        int paths = nextCell(0, 0);
        System.out.println(paths);
    }

    public int nextCell(int x, int y) {
        // path should end in the lower right corner
        if (x == xSize - 1 && y == ySize - 1)
            return 1;
        if (x < 0 || y < 0 || x >= xSize || y >= ySize || visited[x][y]) {
            return 0;
        }
        visited[x][y] = true;
        int c = 0;
        c += nextCell(x + 1, y);
        c += nextCell(x - 1, y);
        c += nextCell(x, y + 1);
        c += nextCell(x, y - 1);
        visited[x][y] = false;
        return c;
    }
}
This gives:
4x4 => 184
5x5 => 8512
6x6 => 1262816
7x7 (even this simple case takes a lot of time!) => 575780564
This means you could (only theoretically) compute all possible paths from any position of an MxM matrix to the lower-right corner and then use this matrix to quickly look up the number of paths. Dynamic programming (reusing previously calculated results) could speed things up a bit.
The general problem of counting the number of simple paths in a graph is #P-complete. Some #P-complete problems have fully polynomial randomized approximation schemes, and some don't, but you claim not to be interested in an approximation. Perhaps there's a way to take advantage of the grid structure, as there is for computing the Tutte polynomial, but I don't have any ideas for how to do this.
There is a similar but less general problem on project Euler: http://projecteuler.net/index.php?section=problems&id=237
I think some of the solutions described in the forum there can be extended to solve your general case. It's a pretty difficult problem, though, especially in the general case.
To get access to their forums, you first need to solve the problem. I won't post the answer here, nor link to a certain site that lists the answer - a site that you can easily find on Google by searching for something really obvious.
This is an open question in mathematics, with direct applications to chemistry and physics, where it is used to model polymer bonds. Some of the earliest work on it was done during the Manhattan Project (the WWII nuclear bomb effort).
It is better known as the Self-Avoiding Walk problem.
I spent a summer at my university's mathematics department researching a Monte Carlo algorithm called the pivot algorithm to approximate the parameters of the asymptotic fit of the number of self-avoiding walks of a given length n.
Please refer to Gordon Slade's excellent book "The Self-Avoiding Walk" for extensive coverage of the types of techniques used to approach this problem thus far.
This is a very complex problem, and I wonder what your motivation may be for considering it. Perhaps you can find a simpler model for what you want, because self-avoiding walks are not simple at all.
Would a matrix showing the edges work? Consider building a matrix A showing where the edges are, i.e. A[a,b] = 1 <=> a->b is an edge in the graph, 0 otherwise. Now raise this matrix to various powers to show how many ways exist to get between vertices using n steps, and then sum them to get the result. This is just one way to frame the problem; there may be others.
I wonder if this belongs on MathOverflow, as another idea.
True, once you have a zero matrix you could stop exponentiating; in your case there aren't many places to go after 3, but the paths from 1 to 3 would be the direct one and the one that goes through 2, so there are only a few matrices to add together before the all-zero one is found.
I'd think there should be a way to compute a bound of, say, n^(n+1), where n is the number of vertices in the graph, that would indicate a stopping point, since by that point every vertex will have been visited once. I'm not sure how to get the cyclic paths out of the problem, though. Or can one assume that the graph is free of cycles?
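For reference, a minimal sketch of that matrix-power idea in C++. The caveat from the last comment applies: entry [i][j] of A^n counts walks of length n, and walks may revisit vertices, so this overcounts relative to the acyclic paths the question asks about:

#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;

Matrix multiply(const Matrix& a, const Matrix& b) {
    const std::size_t n = a.size();
    Matrix c(n, std::vector<long long>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Sums A^1 .. A^maxLen: walks of length at most maxLen between each pair.
Matrix countWalks(const Matrix& adj, int maxLen) {
    Matrix power = adj;
    Matrix total = adj;
    for (int len = 2; len <= maxLen; ++len) {
        power = multiply(power, adj);
        for (std::size_t i = 0; i < adj.size(); ++i)
            for (std::size_t j = 0; j < adj.size(); ++j)
                total[i][j] += power[i][j];
    }
    return total;
}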