best way to branch and bound in CPLEX C++ Concert - c++

I want to branch and bound over an ILP in CPLEX, where I have an 1 dimensional Variable X of the size n, and a 2 dimensional BINARY Variable Y, so Y[i][j] can either be 0 or 1.
now I want to branch and bound over the Y variables. So its going to be a binary branch and bound.
first I want to get a lower bound and an upper bound from a heuristic.
then I want to branch over the Y[1][2] variable, so add a node for Y[1][2] = 0 and Y[1][2] = 1. after each step I want to solve the relaxed LP and compare for upper bounds etc. and then repeat the steps, until I get a optimal solution.
my question is, what is the best way to implement this? I don't have much experience with CPLEX, so:
how can I set the value for Y[1][2] = 0 or 1
what's the most efficient way to branch (should I make a class, function etc...)
how is the best solution going to be saved
I would be very thankful if somebody here has experience in implementing the branch and bound method in CPLEX, concert and could answer my questions.
I would also appreciate an example code.
thank you very much in advance

Related

What's the most efficient way to store a subset of column indices of big matrix and in C++?

I am working with a very big matrix X (say, 1,000-by-1,000,000). My algorithm goes like following:
Scan the columns of X one by one, based on some filtering rules, to identify only a subset of columns that are needed. Denote the subset of indices of columns be S. Its size depends on the filter, so is unknown before computation and will change if the filtering rules are different.
Loop over S, do some computation with a column x_i if i is in S. This step needs to be parallelized with openMP.
Repeat 1 and 2 for 100 times with changed filtering rules, defined by a parameter.
I am wondering what the best way is to implement this procedure in C++. Here are two ways I can think of:
(a) Use a 0-1 array (with length 1,000,000) to indicate needed columns for Step 1 above; then in Step 2 loop over 1 to 1,000,000, use if-else to check indicator and do computation if indicator is 1 for that column;
(b) Use std::vector for S and push_back the column index if identified as needed; then only loop over S, each time extract column index from S and then do computation. (I thought about using this way, but it's said push_back is expensive if just storing integers.)
Since my algorithm is very time-consuming, I assume a little time saving in the basic step would mean a lot overall. So my question is, should I try (a) or (b) or other even better way for better performance (and for working with openMP)?
Any suggestions/comments for achieving better speedup are very appreciated. Thank you very much!
To me, it seems that "step #1 really does not matter much." (At the end of the day, you're going to wind up with: "a set of columns, however represented.")
To me, what's really going to matter is: "just what's gonna happen when you unleash ("parallelized ...") step #2.
"An array of 'ones and zeros,'" however large, should be fairly simple for parallelization, while a more-'advanced' data structure might well, in this case, "just get in the way."
"One thousand mega-bits, these days?" Sure. Done. No problem. ("And if not, a simple array of bit-sets.") However-many simultaneously executing entities should be able to navigate such a data structure, in parallel, with a minimum of conflict . . . Therefore, to my gut, "big bit-sets win."
I think you will find std::vector easier to use. Regarding push_back, the cost is when the vector reallocates (and maybe copies) the data. To avoid that (if it matters), you could set vector::capacity to 1,000,000. Your vector is then 8 MB, insignificant compared to your problem size. It's only 1 order magnitude bigger than a bitmap would be, and a lot simpler to deal with: If we call your vector S and the nth interesting column i, then your column access is just x[S[i]].
(Based on my gut feeling) I'd probably go for pushing back into a vector, but the answer is quite simple: Measure both methods (they are both trivial to implement). Most likely you won't see a noticeable difference.

CPLEX MIP early termination with relative gap, getBestObjValue vs getObjValue

I'm using C++ to model a (maximization) MIP with CPLEX, and I specify a relative gap using
cplex.setParam(IloCplex::EpGap, gap);
I'm puzzled by the difference between
cplex.getBestObjValue();
and
cplex.getObjValue();
in case of early termination because of the gap.
If I understand correctly, the value from getBestObjValue() will always correspond to an integer feasible solution, and a lower bound to the optimal value. On the other hand, the value from getObjValue() (may? will always?) correspond to a non-feasible solution and is an upper bound to the optimal value. Am I understanding this correctly?
I also have another question: the value returned by getBestObjValue() is, in the case of maximization problems, 'the maximum objective function value of all remaining unexplored nodes' (from the CPLEX docs). Is there a a way to query the objective values of these unexplored nodes? I'm asking because I would like to get the minimal value that satisfies my relative gap, not the maximum.
According to the manual:
Cplex.GetBestObjValue Method:
It is computed for a minimization problem as the minimum objective function value of all remaining unexplored nodes. Similarly, it is computed for a maximization problem as the maximum objective function value of all remaining unexplored nodes.
For a regular MIP optimization, this value is also the best known bound on the optimal solution value of the MIP problem. In fact, when a problem has been solved to optimality, this value matches the optimal solution value.
It corresponds to an upper bound (when maximising) of the objective value, there is a gap when you stop the solver before reaching optimality. In MIP, there is branch and bound tree behind, as more nodes are explored, the upper bound decreases. There might or might not be any solution matching the upper bound when you stop by epgap.
Therefore your assumption below is wrong:
If I understand correctly, the value from getBestObjValue() will always correspond to an integer feasible solution.
GetObjValue() on the other hand is the objective value of the current best solution (corresponding to a found feasible solution). It is a lower bound, this is the value that you want to use in your second question.

What does the 'lower bound' in circulation problems mean?

Question: Circulation problems allow you to have both a lower and an upper bound on the flow through a particular arc. The upper bound I understand (like pipes, there's only so much stuff that can go through). However, I'm having a difficult time understanding the lower bound idea. What does it mean? Will an algorithm for solving the problem...
try to make sure every arc with a lower bound will get at least that much flow, failing completely if it can't find a way?
simply disregard the arc if the lower bound can't be met? This would make more sense to me, but would mean there could be arcs with a flow of 0 in the resulting graph, i.e.
Context: I'm trying to find a way to quickly schedule a set of events, which each have a length and a set of possible times they can be scheduled at. I'm trying to reduce this problem to a circulation problem, for which efficient algorithms exist.
I put every event in a directed graph as a node, and supply it with the amount of time slots it should fill. Then I add all the possible times as nodes as well, and finally all the time slots, like this (all arcs point to the right):
The first two events have a single possible time and a length of 1, and the last event has a length of 4 and two possible times.
Does this graph make sense? More specifically, will the amount of time slots that get 'filled' be 2 (only the 'easy' ones) or six, like in the picture?
(I'm using a push-relabel algorithm from the LEMON library if that makes any difference.)
Regarding the general circulation problem:
I agree with #Helen; even though it may not be as intuitive to conceive of a practical use of a lower bound, it is a constraint that must be met. I don't believe you would be able to disregard this constraint, even when that flow is zero.
The flow = 0 case appeals to the more intuitive max flow problem (as pointed out by #KillianDS). In that case, if the flow between a pair of nodes is zero, then they cannot affect the "conservation of flow sum":
When no lower bound is given then (assuming flows are non-negative) a zero flow cannot influence the result, because
It cannot introduce a violation to the constraints
It cannot influence the sum (because it adds a zero term).
A practical example of a minimum flow could exist because of some external constraint (an associated problem requires at least X water go through a certain pipe, as pointed out by #Helen). Lower bound constraints could also arise from an equivalent dual problem, which minimizes the flow such that certain edges have lower bound (and finds an optimum equivalent to a maximization problem with an upper bound).
For your specific problem:
It seems like you're trying to get as many events done in a fixed set of time slots (where no two events can overlap in a time slot).
Consider the sets of time slots that could be assigned to a given event:
E1 -- { 9:10 }
E2 -- { 9:00 }
E3 -- { 9:20, 9:30, 9:40, 9:50 }
E3 -- { 9:00, 9:10, 9:20, 9:30 }
So you want to maximize the number of task assignments (i.e. events incident to edges that are turned "on") s.t. the resulting sets are pairwise disjoint (i.e. none of the assigned time slots overlap).
I believe this is NP-Hard because if you could solve this, you could use it to solve the maximal set packing problem (i.e. maximal set packing reduces to this). Your problem can be solved with integer linear programming, but in practice these problems can also be solved very well with greedy methods / branch and bound.
For instance, in your example problem. event E1 "conflicts" with E3 and E2 conflicts with E3. If E1 is assigned (there is only one option), then there is only one remaining possible assignment of E3 (the later assignment). If this assignment is taken for E3, then there is only one remaining assignment for E2. Furthermore, disjoint subgraphs (sets of events that cannot possibly conflict over resources) can be solved separately.
If it were me, I would start with a very simple greedy solution (assign tasks with fewer possible "slots" first), and then use that as the seed for a branch and bound solver (if the greedy solution found 4 task assignments, then bound if you recursive subtree of assignments cannot exceed 3). You could even squeeze out some extra performance by creating the graph of pairwise intersections between the sets and only informing the adjacent sets when an assignment is made. You can also update your best number of assignments as you continue the branch and bound (I think this is normal), so if you get lucky early, you converge quickly.
I've used this same idea to find the smallest set of proteins that would explain a set of identified peptides (protein pieces), and found it to be more than enough for practical problems. It's a very similar problem.
If you need bleeding edge performance:
When rephrased, integer linear programming can do nearly any variant of this problem that you'd like. Of course, in very bad cases it may be slow (in practice, it's probably going to work for you, especially if your graph is not very densely connected). If it doesn't, regular linear programming relaxations approximate the solution to the ILP and are generally quite good for this sort of problem.
Hope this helps.
The lower bound on the flow of an arc is a hard constraint. If the constraints can't be met, then the algorithm fails. In your case, they definitely can't be met.
Your problem can not be modeled with a pure network-flow model even with lower bounds. You are trying to get constraint that a flow is either 0 or at least some lower bound. That requires integer variables. However, the LEMON package does have an interface where you can add integer constraints. The flow out of each of the first layer of arcs must be either 0 or n where n is the number of required time-slots or you could say that at most one arc out of each "event" has nonzero flow.
Your "disjunction" constraint,
can be modeled as
f >= y * lower
f <= y * upper
with y restricted to being 0 or 1. If y is 0, then f can only be 0. If y is 1, the f can be any value between lower and upper. The mixed-integer programming algorithms will orders of magnitude slower than the network-flow algorithms, but they will model your problem.

Finding all permutations that match a set of rules

I am given N numbers and for them apply M rules about their order. The rules are represented in a pairs of indexes and every pair (A, B) is telling that the number with index A (A-th number) must be AFTER the B-th number - it doesn't have to be next to him.
Ex: N = 4
1 2 3 4
M = 2
3 2
3 1
Output: 1234, 4213, 4123, 2134, 2143, 2413, 1423 ...Maybe there are even more:)
The algorithm should give me all the permutations available that don't break the rules, like in the example - 3 must always be after 2 and after 1.
I tried bruteforcing but it didn't work (although bruteforce should work in here, N is in the range (1,8). )
Any ideas ?
Just as a hint.
You can treat your set of rules as a graph. Each index is a vertex, each rule is a directed edge.
Any proper ordering of the numbers (i.e. a permutation that satisfies the rules) corresponds to so called topological ordering of the above graph. In order to generate all valid orderings of your numbers you need to generate all possible topological orderings of that graph.
P.S. The first algorithm for topological ordering given at the linked Wikipedia page already allows for a fairly straightforward solution that would enumerate all valid permutations. It will take some effort and some care to implement, but it is not rocket science.
Brute forcing would be going through every permutation, which is O(N!), and for each permutation simply looping through every rule to confirm that they aplpy, which is O(M). This ends up O(N!M) which is kind of ridiculous, but it shouldn't "not work" for such a small set.
Honestly, your best bet is to go back and get the brute force solution working. Once that is done (and if you still have time, etc) you can look for a better algorithm.
EDIT to the down voter. The student is (should be) trying to get his homework done on time. By the sounds of it, his homework is a programming exercise where a brute-force solution would be adequate. Helping him to figure out an efficient algorithm is not addressing his REAL problem.
In this case he has tried the simple brute-force approach (which everyone agrees ought to work for small N values) and given up on it prematurely to try something that is probably more difficult. Any experienced developer will tell you that this is a bad idea. The student needs and deserves to be told so, and if he is sensible he will pay attention. But obviously, it his choice ...

how to create a 20000*20000 matrix in C++

I try to calculate a problem with 20000 points, so there is a distance matrix with 20000*20000 elements, how can I store this matrix in C++? I use Visual Studio 2008, on a computer with 4 GB of RAM. Any suggestion will be appreciated.
A sparse matrix may be what you looking for. Many problems don't have values in every cell of a matrix. SparseLib++ is a library which allows for effecient matrix operations.
Avoid the brute force approach you're contemplating and try to envision a solution that involves populating a single 20000 element list, rather than an array that covers every possible permutation.
For starters, consider the following simplistic approach which you may be able to improve upon, given the specifics of your problem:
int bestResult = -1; // some invalid value
int bestInner;
int bestOuter;
for ( int outer = 0; outer < MAX; outer++ )
{
for ( int inner = 0; inner < MAX; inner++ )
{
int candidateResult = SomeFunction( list[ inner ], list[ outer ] );
if ( candidateResult > bestResult )
{
bestResult = candidateResult;
bestInner = inner;
bestOuter = outer;
}
}
}
You can represent your matrix as a single large array. Whether it's a good idea to do so is for you to determine.
If you need four bytes per cell, your matrix is only 4*20000*20000, that is, 1.6GB. Any platform should give you that much memory for a single process. Windows gives you 2GiB by default for 32-bit processes -- and you can play with the linker options if you need more. All 32-bit unices I tried gave you more than 2.5GiB.
Is there a reason you need the matrix in memory?
Depending on the complexity of calculations you need to perform you could simply use a function that calculates your distances on the fly. This could even be faster than precalculating ever single distance value if you would only use some of them.
Without more references to the problem at hand (and the use of the matrix), you are going to get a lot of answers... so indulge me.
The classic approach here would be to go with a sparse matrix, however the default value would probably be something like 'not computed', which would require special handling.
Perhaps that you could use a caching approach instead.
Apparently I would say that you would like to avoid recomputing the distances on and on and so you'd like to keep them in this huge matrix. However note that you can always recompute them. In general, I would say that trying to store values that can be recomputed for a speed-off is really what caching is about.
So i would suggest using a distance class that abstract the caching for you.
The basic idea is simple:
When you request a distance, either you already computed it, or not
If computed, return it immediately
If not computed, compute it and store it
If the cache is full, delete some elements to make room
The practice is a bit more complicated, of course, especially for efficiency and because of the limited size which requires an algorithm for the selection of those elements etc...
So before we delve in the technical implementation, just tell me if that's what you're looking for.
Your computer should be able to handle 1.6 GB of data (assuming 32bit)
size_t n = 20000;
typedef long dist_type; // 32 bit
std::vector <dist_type> matrix(n*n);
And then use:
dist_type value = matrix[n * y + x];
You can (by using small datatypes), but you probably don't want to.
You are better off using a quad tree (if you need to find the nearest N matches), or a grid of lists (if you want to find all points within R).
In physics, you can just approximate distant points with a field, or a representative amalgamation of points.
There's always a solution. What's your problem?
Man you should avoid the n² problem...
Put your 20 000 points into a voxel grid.
Finding closest pair of points should then be something like n log n.
As stated by other answers, you should try hard to either use sparse matrix or come up with a different algorithm that doesn't need to have all the data at once in the matrix.
If you really need it, maybe a library like stxxl might be useful, since it's specially designed for huge datasets. It handles the swapping for you almost transparently.
Thanks a lot for your answers. What I am doing is to solve a vehicle routing problem with about 20000 nodes. I need one matrix for distance, one matrix for a neighbor list (for each node, list all other nodes according to the distance). This list will be used very often to find who can be some candidates. I guess sometimes distances matrix can be ommited if we can calculate when we need. But the neighbor list is not convenient to create every time. the list data type could be int.
To mgb:
how much can a 64 bit windows system help this situation?