searching for a dedicated binary linear programming algorithm - linear-programming

There are various linear programming solvers out there, such as lpSolve and GLPK. These are general-purpose LP solvers that are suitable for many problems, but is there a dedicated LP solver for the binary (0-1) case?
To illustrate, my problem involves finding the minimum number of lines (rows) that cover all columns of a matrix, something like:
0 1 0 1
1 0 0 1
1 0 1 0
It can be seen that the first and third rows cover all four columns, the second row being redundant. Denoting the three lines as A, B and C, the problem boils down to solving a system of inequalities, one per column:
B + C >= 1
A >= 1
C >= 1
A + B >= 1
This gives the solution:
A = 1; B = 0; C = 1
(which means the minimum number of lines is 2).
Thanks in advance,
Adrian
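For a matrix this small, the minimum cover can be checked by exhaustive search. The following is only a sketch in Python, not a dedicated binary LP solver: the function name `min_row_cover` is made up for illustration, and the search is exponential in the number of rows, so it is useful for verifying small cases rather than for solving large ones.

```python
from itertools import combinations

def min_row_cover(matrix):
    """Return a smallest tuple of row indices whose 1s cover every column.

    Exhaustive search by increasing subset size; practical only for a
    small number of rows.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    for size in range(n_rows + 1):
        for rows in combinations(range(n_rows), size):
            # One constraint per column: at least one chosen row has a 1 there.
            if all(any(matrix[r][c] for r in rows) for c in range(n_cols)):
                return rows
    return None  # some column contains no 1 at all

matrix = [
    [0, 1, 0, 1],  # A
    [1, 0, 0, 1],  # B
    [1, 0, 1, 0],  # C
]
print(min_row_cover(matrix))  # row indices 0 and 2, i.e. A and C
```

Each column check corresponds to one of the inequalities in the question (for example, the first column gives B + C >= 1).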

Related

Efficient algorithms to check if a binary maze is solvable with restricted moves

I am given a problem to generate binary mazes of dimensions r x c (0/false for blocked cell and 1/true for free cell). Each maze is supposed to be solvable.
One can move from (i, j) to either (i + 1, j)(down) or (i, j + 1)(right). The solver is expected to reach (r - 1, c - 1)(last cell) starting from (0, 0)(first cell).
Below is my algorithm (a modified BFS) to check whether a maze is solvable. It runs in O(r*c) time. I am trying to find a solution with better time complexity. Can anyone suggest another algorithm? I don't want the path, I just want to check.
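#include <iostream>
#include <queue>
#include <utility>
#include <vector>
const int r = 5, c = 5;
bool isSolvable(const std::vector<std::vector<bool>> &m) {
    if (!m[0][0])
        return false;
    // Mark cells as they are enqueued so that each one is processed at most
    // once; otherwise a cell can be queued once per path that reaches it.
    std::vector<std::vector<bool>> seen(r, std::vector<bool>(c, false));
    std::queue<std::pair<int, int>> q;
    q.push({0, 0});
    seen[0][0] = true;
    while (!q.empty()) {
        auto p = q.front();
        q.pop();
        if (p.first == r - 1 and p.second == c - 1)
            return true;
        // Move down.
        if (p.first + 1 < r and m[p.first + 1][p.second] and not seen[p.first + 1][p.second]) {
            seen[p.first + 1][p.second] = true;
            q.push({p.first + 1, p.second});
        }
        // Move right.
        if (p.second + 1 < c and m[p.first][p.second + 1] and not seen[p.first][p.second + 1]) {
            seen[p.first][p.second + 1] = true;
            q.push({p.first, p.second + 1});
        }
    }
    return false;
}
int main() {
    char ch;
    std::vector<std::vector<bool>> maze(r, std::vector<bool>(c));
    for (auto &&row : maze)
        for (auto &&ele : row) {
            std::cin >> ch;
            ele = (ch == '1');
        }
    std::cout << isSolvable(maze) << std::endl;
    return 0;
}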
Recursive Solution:
bool exploreMaze(std::vector<std::vector<bool>> &m, std::vector<std::vector<bool>> &dp, int x = 0, int y = 0) {
    if (x + 1 > r or y + 1 > c) return false;
    if (not m[x][y]) return false;
    if (x == r - 1 and y == c - 1) return true;
    if (dp[x][y + 1] and exploreMaze(m, dp, x, y + 1)) return true;
    if (dp[x + 1][y] and exploreMaze(m, dp, x + 1, y)) return true;
    return dp[x][y] = false;
}
bool isSolvable(std::vector<std::vector<bool>> &m) {
    std::vector<std::vector<bool>> dp(r + 1, std::vector<bool>(c + 1, true));
    return exploreMaze(m, dp);
}
Specific need:
I aim to use this function many times in my code: changing a certain cell of the grid, and then rechecking whether that changes the result. Is there any possibility of memoization so that the results generated in one run can be reused? That could give me better average time complexity.
If you call this function many times with only a few changes in between, there is a data structure called a link-cut tree which supports the following operations in O(log n) time:
Link (Links 2 graph nodes)
Cut (Cuts given edge from a graph)
Is Connected? (checks if 2 nodes are connected by some edges)
Given that a grid is an implicit graph, we can first build the link-cut tree in O(n*m*log(n*m)) time.
Then every update (adding or deleting a node) can be done by adding or deleting the edges to its neighbours, which takes only O(log(n*m)) time.
That said, I suggest optimizing the maze generation algorithm instead of using this complicated data structure. Maze generation can be done quite easily with DFS.
The problem you are looking at is known as dynamic connectivity and, as @Photon said, since you have an acyclic graph, one solution is to use a link-cut tree. Another is based on a different representation, the Euler tour tree.
You cannot go below O(r*c) in the general case because, with any pathfinding strategy, there is always a special case of a maze where you need to traverse a rectangular subregion of dimensions proportional to r and c before finding the correct path.
As for memoization: there is something you can do, but it might not help that much. You can build a copy of the maze but only keeping the valid paths, and putting in each cell the direction towards the previous and next cells, as well as the number of paths that traverse it. Let me illustrate.
Take the following maze, and the corresponding three valid paths:
maze         path 1       path 2       path 3
1 1 1 0 1    1 1 1 0 0    1 1 0 0 0    1 1 0 0 0
0 1 1 1 1    0 0 1 1 0    0 1 1 1 0    0 1 0 0 0
0 1 0 1 0    0 0 0 1 0    0 0 0 1 0    0 1 0 0 0
1 1 0 1 0    0 0 0 1 0    0 0 0 1 0    0 1 0 0 0
0 1 1 1 1    0 0 0 1 1    0 0 0 1 1    0 1 1 1 1
You can build what I'll call the forward direction grid (FDG), the backward direction grid (BDG), and the valuation grid:
FDG          BDG          valuation
R B D N N    B L L N N    3 3 1 0 0
N B R D N    N U B L N    0 2 2 2 0
N D N D N    N U N U N    0 1 0 2 0
N D N D N    N U N U N    0 1 0 2 0
N R R R B    N U L B L    0 1 1 3 3
R = right, D = down, L = left, U = up, B = both, and N = none.
The FDG tells you, in each cell, in what direction is the next cell on a valid path (or if both are). The BDG is the same thing in reverse. The valuation grid tells you how many valid paths contain each cell.
For convenience, I'm putting a B at the destination in the direction grids. You can see it as if the goal was to exit the maze, and to do so, you can go in either direction from the final cell. Note that there are always the same number of B cells, and that it's exactly the number of valid paths.
The easiest way to get these grids is to build them during a depth-first search. In fact, you can even use the BDG for the depth-first search since it contains backtracking information.
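As an alternative to the depth-first search, the valuation grid alone can also be computed with two dynamic-programming passes: count the paths from the start to each cell, count the paths from each cell to the goal, and multiply the two. The sketch below is in Python rather than the C++ of the question, the name `valuation_grid` is made up for illustration, and it does not build the direction grids.

```python
def valuation_grid(maze):
    """For each cell, count the right/down start-to-goal paths that contain it."""
    r, c = len(maze), len(maze[0])

    # fwd[i][j]: number of paths from (0, 0) to (i, j).
    fwd = [[0] * c for _ in range(r)]
    for i in range(r):
        for j in range(c):
            if not maze[i][j]:
                continue
            if i == 0 and j == 0:
                fwd[i][j] = 1
            else:
                from_up = fwd[i - 1][j] if i > 0 else 0
                from_left = fwd[i][j - 1] if j > 0 else 0
                fwd[i][j] = from_up + from_left

    # bwd[i][j]: number of paths from (i, j) to (r - 1, c - 1).
    bwd = [[0] * c for _ in range(r)]
    for i in reversed(range(r)):
        for j in reversed(range(c)):
            if not maze[i][j]:
                continue
            if i == r - 1 and j == c - 1:
                bwd[i][j] = 1
            else:
                to_down = bwd[i + 1][j] if i < r - 1 else 0
                to_right = bwd[i][j + 1] if j < c - 1 else 0
                bwd[i][j] = to_down + to_right

    # A path through (i, j) is a path into it followed by a path out of it.
    return [[fwd[i][j] * bwd[i][j] for j in range(c)] for i in range(r)]

maze = [
    [1, 1, 1, 0, 1],
    [0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 1, 1],
]
for row in valuation_grid(maze):
    print(*row)
```

The bottom-right entry of the result is the total number of valid paths, so the maze is solvable exactly when that entry is not zero.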
Now that you have these, you can block or free a cell and update the three grids accordingly. If you keep the number of valid paths separately as well, you can update it at the same time and the condition "the maze is solvable" becomes "the number of valid paths is not zero". Also note that you can combine both direction grids, but I find them easier to grasp separately.
To update the grids and the number of valid paths, there are three cases:
(A) you blocked a cell that was marked N; you don't need to do anything.
(B) you blocked a cell that was not marked N, so previously part of at least one valid path; decrement the number of valid paths by the cell's value in the valuation grid, and update all three grids accordingly.
(C) you freed a cell (that was necessarily marked N); update all three grids first and then increment the number of valid paths by the cell's new value in the valuation grid.
Updating the grids is a bit tricky, but the point is that you do not need to update every cell.
In case (B), if the number of valid paths hits zero, you can reset all three grids. Otherwise, you can use the FDG to update the correct cells forward until you hit the bottom-right, and the BDG to update the correct ones backward until you hit the top-left.
In case (C), you can update the direction grids first by doing a depth-first search, both forward and backward, and backtrack as soon as you hit a cell that isn't marked N (you need to update this cell as well). Then, you can make two sums of the values, in the valuation grid, of the cells you hit: one going forward and one going backward. The number of paths going through the new cell is the product of these two sums. Next, you can update the rest of the valuation grid with the help of the updated direction grids.
I would imagine this technique having an effect on performance with very large mazes, if the updates to the maze itself do not create or break too many paths every time.

2-D plane division

Here is the problem statement:
*Chef is working with lines on a 2-D plane. He knows that every line on a plane can be clearly defined by three coefficients A, B and C: any point (x, y) lies on the line if and only if A * x + B * y + C = 0. Let's call a set of lines to be perfect if there does not exist a point that belongs to two or more distinct lines of the set. He has a set of lines on a plane and he wants to find out the size of the largest perfect subset of this set.
Input
The first line of input contains one integer T denoting the number of test cases. Each test case consists of one integer N denoting the number of lines. The next N lines contain 3 space-separated integers each, denoting the coefficients A, B and C respectively.
Output
For each test case, output the cardinality of the largest perfect subset on a single line.
Constraints
Input:
1
5
1 1 0
1 2 3
3 4 5
30 40 0
30 40 50
Output:
2
Explanation
Lines 3*x + 4*y + 5 = 0 and 30*x + 40*y + 0 = 0 form a biggest perfect subset.*
So if the ratios of A to B are the same, the lines are parallel, which satisfies the problem statement. For example, if A[1] / B[1] == A[2] / B[2], then line one and line two are parallel. But this equation also holds when the two lines are the same line, in which case they share infinitely many points, which is not what the problem wants. So we need to use C to determine whether the lines coincide (i.e. A[1]/A[2] == B[1]/B[2] == C[1]/C[2]). But the code I wrote with these ideas is very inefficient. Can you suggest a more time-efficient solution?
You can write a linear-time algorithm for this.
The idea is to have a map where the key is a direction and the value is a set.
For each direction, the set contains only the distinct lines that have that direction. The answer is then the size of the largest set.
The direction of a line Ax + By + C = 0 is A/B. The problem is that if B = 0, this doesn't work as a key.
You can have a special set for the case B = 0, which you keep separate and don't insert into the map.
The value that you insert into the set for a given line Ax + By + C = 0 should be C/B, since two lines with the same direction coincide exactly when their C/B values are equal.
In the special case B = 0, you should use C/A.
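The map-of-sets idea can be sketched as follows. This is a Python illustration, not the answerer's code: instead of the ratios A/B and C/B it uses a reduced integer pair as the direction key and an exact `Fraction` as the offset, which avoids both floating-point comparisons and the separate B = 0 case. The name `largest_perfect_subset` is made up, and the sketch assumes A and B are never both zero.

```python
from collections import defaultdict
from fractions import Fraction
from math import gcd

def largest_perfect_subset(lines):
    """lines: iterable of integer triples (A, B, C) for A*x + B*y + C = 0."""
    groups = defaultdict(set)  # direction -> set of distinct offsets
    for a, b, c in lines:
        g = gcd(a, b)  # assumption: a and b are not both zero
        # Canonical sign, so (A, B, C) and (-A, -B, -C) map to the same key.
        if a < 0 or (a == 0 and b < 0):
            g = -g
        direction = (a // g, b // g)
        offset = Fraction(c, g)  # equal offsets mean the same line
        groups[direction].add(offset)
    return max((len(offsets) for offsets in groups.values()), default=0)

sample = [(1, 1, 0), (1, 2, 3), (3, 4, 5), (30, 40, 0), (30, 40, 50)]
print(largest_perfect_subset(sample))
```

With a hash map and hash sets, each line is processed in expected constant time apart from the gcd computation.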

Integer Linear Programming formulation for Test Cover?

The Test Cover problem can be defined as follows:
Suppose we have a set of n diseases and a set of m tests we can perform to check for symptoms. We also are given the following:
an n x m matrix A where A[i][j] is a binary value representing the result of running the jth test on a patient with the ith disease (1 indicates a positive result, 0 indicates negative);
the cost of running test j, c_j; and that
any patient will have exactly one disease
The task is to find a set of tests that can uniquely identify each of the n diseases at minimal cost.
This problem can be formulated as an Integer Linear Program, where we want to minimize the objective function \sum_{j=1}^{m} c_j x_j, where x_j = 1 if we choose to include test j in our set, and 0 otherwise.
My question is:
What is the set of linear constraints for this problem?
Incidentally, I believe this problem is NP-hard (as is Integer Linear Programming in general).
Well if I am correct you just need to ensure
\sum_j x_j.A_ij >= 1 forall i
Let T be the matrix that results from deleting the jth column of A for all j such that x_j = 0.
Then choosing a set of tests that can uniquely distinguish any two diseases is equivalent to ensuring that every row of T is unique.
Observe that two rows k and l are identical if and only if (T[k][j] XOR T[l][j]) = 0 for all j.
So, the constraints we want are
\sum_{j=1}^{m} x_j(A[k][j] XOR A[l][j]) >= 1
for all pairs of diseases 1 <= k < l <= n.
Note that the constraints above are linear, since we can just pre-compute the coefficient (A[k][j] XOR A[l][j]).
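To make the pre-computation concrete, here is a small Python sketch that builds the XOR coefficients for every pair of diseases and then checks candidate test sets by brute force. The function names and the 3 x 3 example matrix are invented for illustration, and the exhaustive loop only stands in for a real ILP solver on tiny inputs.

```python
from itertools import combinations

def xor_constraints(A):
    """Map each pair of rows (k, l), k < l, to its constraint coefficients."""
    m = len(A[0])
    return {
        (k, l): [A[k][j] ^ A[l][j] for j in range(m)]
        for k, l in combinations(range(len(A)), 2)
    }

def is_feasible(x, constraints):
    """Check sum_j x_j * coeff_j >= 1 for every pair of diseases."""
    return all(
        sum(xj * cj for xj, cj in zip(x, coeffs)) >= 1
        for coeffs in constraints.values()
    )

def cheapest_test_cover(A, costs):
    """Brute force over all 2^m test subsets; returns (cost, x) or None."""
    m = len(costs)
    constraints = xor_constraints(A)
    best = None
    for mask in range(1 << m):
        x = [(mask >> j) & 1 for j in range(m)]
        if is_feasible(x, constraints):
            cost = sum(cj * xj for cj, xj in zip(costs, x))
            if best is None or cost < best[0]:
                best = (cost, x)
    return best

# Hypothetical instance: 3 diseases (rows), 3 tests (columns).
A = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
]
costs = [1, 1, 5]
print(cheapest_test_cover(A, costs))
```

In this instance the first two tests already give every disease a distinct result pattern, so the expensive third test is not needed.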

How to do a set difference, except without eliminating repeated elements

I am trying to do the following in Matlab. Take two lists of numbers, possibly containing repeated elements, and subtract one set from the other set.
Ex: A=[1 1 2 4]; B=[1 2 4];
Desired result would be A-B=C=[1]
Or, another example, E=[3 3 5 5]; F=[3 3 5];
Desired result would be E-F=G=[5]
I wish I could do this using Matlab's set operations, but their function setdiff does not respect the repeated elements in the matrices. I appreciate that this is correct from a strict set theory standpoint, but would nevertheless like to tackle problems like: "I have 3 apples and 4 oranges, and you take 2 apples and 1 orange, how many of each do I have left." My range of possible values in these sets is in the thousands, so building a large matrix for tallying elements and then subtracting matrices does not seem feasible for speed reasons. I will have to do thousands of these calculations with thousands of set elements during a gui menu operation.
Example of what I would like to avoid for tackling the second example above:
E=[0 0 2 0 2]; F=[0 0 2 0 1];
G=E-F=[0 0 0 0 1];
Thanks for your help!
This can be done with the accumarray command.
A = [1 1 2 4]';
B = [1 2 4]'; % <-make these column vectors
X = accumarray(A, 1);
Y = accumarray(B, 1);
This will produce the output
X = [2 1 0 1]'
and
Y = [1 1 0 1]'
Here X(i) is the number of occurrences of the number i in vector A, and Y(i) is the number of occurrences of the number i in vector B.
Then you can just take X - Y.
One caveat: if the maximum values of A and B are different, the outputs from accumarray will have different lengths. In that case, you can assign each output into the leading part of a vector of zeros whose length is the larger of the two maxima.
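For comparison, the same tally-and-subtract idea looks like this in Python, where `collections.Counter` keeps counts only for values that actually occur, so no large zero-filled vector is needed. This is a sketch in a different language from the question (MATLAB), and the function name `multiset_difference` is made up for illustration.

```python
from collections import Counter

def multiset_difference(a, b):
    """Remove one copy from a for every matching copy in b."""
    remaining = Counter(a) - Counter(b)  # counts that drop to <= 0 are discarded
    return sorted(remaining.elements())

print(multiset_difference([1, 1, 2, 4], [1, 2, 4]))
print(multiset_difference([3, 3, 5, 5], [3, 3, 5]))
```

Because the counts are stored in a hash map, negative or very large values need no special handling.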
I just want to improve on Prototoast's answer.
In order to avoid pitfalls involving non-positive numbers in A or B use hist:
A = [-10 0 1 1 2 4];
B = [1 2 4];
We need the minimum and maximum values in the union of A and B:
U = [A,B];
range_ = min(U):max(U);
So that we can use hist to give us same length vectors:
a = hist(A,range_)
b = hist(B,range_)
Now you need to subtract the histograms:
r = a-b
If you want the set difference operator to be symmetric, use:
r = abs(a-b)
The following gives you the items that are in A \ B (\ here is your modified set difference); the number of copies of each item is the corresponding entry of r:
C = range_(r > 0)
Hope this helps.

Logical Question

Consider a [4x8] matrix "A" and a [1x8] matrix "B". I need to check whether there exists an "X" such that
[A]^T * [X] = [B]^T with X >= 0 { X is a [4x1] matrix, T = transpose }
Now here is the tricky/tedious part. The matrix A always has 1 on its diagonal: A11, A22, A33, A44 = 1. The matrix can be considered as two halves, the first half being the first 4 columns and the second half being the last 4 columns, like this:
     1 -1 -1 -1    1  0  0  1
A = -1  1 -1  0    0  1  0  0
    -1 -1  1  0    1  0  0  0
    -1 -1 -1  1    1  1  0  0
Each row in the first half can have either two or three -1's. If a row has two -1's, the corresponding row in the second half should have one 1; if it has three -1's, the corresponding row in the second half should have two 1's. The overall objective is for each row to sum to 0.
Now B is a [1x8] matrix which can also be considered as two halves as follows:
B = -1 -1 0 0 0 0 1 1
Here there can be one, two, three or four -1's in the first half, and there should be an equal number of 1's in the second half. All combinations should be considered. For example, if there are two -1's in the first half, they can be placed in 4 choose 2 = 6 ways, and for each of them there are 6 ways to place the 1's in the second half, for a total of 6*6 = 36 ways, i.e. 36 different values of B if there are two -1's in the first half. The 1's in the matrix A should be placed in the same way. The approach I could think of is to use a valarray or something similar to build the matrices A and B, but I don't know how to proceed.
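The enumeration of B described above can be sketched with combinations of positions. This is a Python illustration (not the C++ or Matlab asked for), and the function name `b_vectors` is made up; it only generates the candidate B vectors and does not test feasibility.

```python
from itertools import combinations

def b_vectors(k, half=4):
    """All vectors with k entries of -1 in the first half and k entries of 1 in the second."""
    vectors = []
    for neg_positions in combinations(range(half), k):
        for pos_positions in combinations(range(half), k):
            b = [0] * (2 * half)
            for i in neg_positions:
                b[i] = -1
            for i in pos_positions:
                b[half + i] = 1
            vectors.append(b)
    return vectors

# Number of candidates for each count of -1's.
for k in range(1, 5):
    print(k, len(b_vectors(k)))
```

For k = 2 this yields the 36 vectors counted in the text, and the example B = -1 -1 0 0 0 0 1 1 is one of them.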
Now for every A, I've to test it with every combinations of B to see if there exists
[A]^T * [X] = [B]^T
I'm trying to prove a result that I got, so I need to know whether such an X exists or not. I'm very confused about how to implement this. Any suggestions are welcome. This falls under linear programming. I want it either in C++ or in Matlab. Other languages are also acceptable, but I'm familiar with only these two. Thanks in advance.
Update:
Here is my answer for this problem :
clear;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%# Generating all possible values of vector B
%# permutations using dec2bin (start from 17 since it's the first solution)
vectorB = str2double(num2cell(dec2bin(17:255)));
%# changing the sign in the first half, then check that the total is zero
vectorB(:,1:4) = - vectorB(:,1:4);
vectorB = vectorB(sum(vectorB,2)==0,:);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%# generate all possible variation of first/second halves
z = -[0 1 1; 1 0 1; 1 1 0; 1 1 1]; n = -sum(z,2);
h1 = {
[ ones(4,1) z(:,1:3)] ;
[z(:,1:1) ones(4,1) z(:,2:3)] ;
[z(:,1:2) ones(4,1) z(:,3:3)] ;
[z(:,1:3) ones(4,1) ] ;
};
h2 = arrayfun(@(i) unique(perms([zeros(1,4-i) ones(1,i)]),'rows'), (1:2)', ...
'UniformOutput',false);
%'# generate all possible variations of complete rows
rows = cell(4,1);
for r=1:4
rows{r} = cell2mat( arrayfun( ...
@(i) [ repmat(h1{r}(i,:),size(h2{n(i)-1},1),1) h2{n(i)-1} ], ...
(1:size(h1{r},1))', 'UniformOutput',false) );
end
%'# generate all possible matrices (pick one row from each to form the matrix)
sz = cellfun(@(M)1:size(M,1), rows, 'UniformOutput',false);
[X1 X2 X3 X4] = ndgrid(sz{:});
matrices = cat(3, ...
rows{1}(X1(:),:), ...
rows{2}(X2(:),:), ...
rows{3}(X3(:),:), ...
rows{4}(X4(:),:) );
matrices = permute(matrices, [3 2 1]); %# 4-by-8-by-104976
A = matrices;
clear matrices X1 X2 X3 X4 rows h1 h2 sz z n r
options = optimset('LargeScale','off','Display','off');
for i = 1:size(A,3),
    for j = 1:size(vectorB,1),
        X = linprog([],[],[],A(:,:,i)',vectorB(j,:)');
        if(size(X,1)>0) %# To check that it's not an empty matrix
            if((size(find(X < 0),1)== 0)) %# to check the condition X>=0
                if (A(:,:,i)'* X == (vectorB(j,:)'))
                    X
                end
            end
        end
    end
end
I got it with the help of the Stack Overflow folks. The only problem is that the linprog function prints a lot of messages in every iteration along with the answers produced. The messages are:
(1) Exiting due to infeasibility: an all-zero row in the constraint matrix does not have a zero in corresponding right-hand-side entry.
(2) Exiting: One or more of the residuals, duality gap, or total relative error has stalled: the primal appears to be infeasible (and the dual unbounded). (The dual residual < TolFun=1.00e-008.)
What does this mean? How can I overcome it?
It is not clear from your question whether you are familiar with systems of linear equations and their solution, or whether that is what you are trying to "invent". See also here for a Matlab-specific explanation.
If you are familiar with that, you should be clearer in your question about what makes your problem different.