How to implement a 'nested' cost function in Gecode? - c++

I am new to Gecode and constraint programming in general.
So far, I haven't had much trouble picking up Gecode; it's great. But I was wondering what the best way is to implement a "nested" cost function. Specifically, I want to minimize X, but within the space of solutions for which X is equal, prefer solutions that minimize Y. I could probably hack it by defining a cost function that looks like X*large_number + Y, but I'd prefer to do this properly if there's a good solution.
If anyone can point me to an explanation of how to implement this in Gecode, that would be really helpful. Thanks!

You can define any kind of optimization criterion using the constrain member function of a space in Gecode. See Section 2.5 of Modeling and Programming with Gecode for an example. In your case, the straightforward way would be to add a constrain member that posts a lexicographic constraint between the previous best solution's cost and the current space, as sketched below.
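For instance, the constrain member could look roughly like this (a minimal sketch, assuming a Space subclass with IntVar members x and y; the class and member names are placeholders, and the constructor/copy machinery is elided):

#include <gecode/int.hh>
#include <gecode/minimodel.hh>
using namespace Gecode;

class MyModel : public Space {
protected:
    IntVar x, y; // the two cost components (hypothetical names)
public:
    // ... constructor, copy constructor, and copy() as usual ...
    // Called by the branch-and-bound engine with the best solution so far:
    virtual void constrain(const Space& b) {
        const MyModel& best = static_cast<const MyModel&>(b);
        // Accept only solutions that are lexicographically smaller:
        // x < best.x, or (x == best.x and y < best.y).
        rel(*this, (x < best.x.val()) ||
                   ((x == best.x.val()) && (y < best.y.val())));
    }
};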
That being said, optimizing with respect to a lexicographic order can in general be wasteful (too much search). It is often better to first run a search optimizing the first component (X in your case). After that, re-run the search with the first component's value fixed (X set to the best possible value) and optimize the second component (Y in your case). Iterate as needed for all elements of the cost.
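The two-phase approach could be sketched like this (hypothetical names again; it assumes MyModel's constructor can optionally fix x, and an accessor for the solved value of x):

#include <gecode/search.hh>

// Phase 1: use a constrain() that compares only x; find the minimal x.
MyModel* root = new MyModel(/* ... */);
BAB<MyModel> phase1(root);
delete root; // the engine works on a clone of the root space
MyModel* best = nullptr;
while (MyModel* s = phase1.next()) { delete best; best = s; }
int bestX = best->xValue(); // hypothetical accessor returning x.val()

// Phase 2: rebuild the model with x fixed to bestX and a constrain()
// that compares only y, then run branch and bound again. The last
// solution returned minimizes y among all solutions with minimal x.
MyModel* root2 = new MyModel(/* ..., posting x == bestX ... */);
BAB<MyModel> phase2(root2);
delete root2;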

Related

How can I best use my objective function to quickly find *a* feasible solution (Gurobi)?

I have a working ILP Gurobi model (exclusively binary variables). Reducing runtime and finding a feasible solution is of far more value to me than finding the optimal solution. Setting SolutionLimit to 1 does help. I realized that my objective function sums hundreds of thousands of variables. If I don't truly care about optimality, can I somehow simplify my objective function to reduce the burden on the solver?
Here is my current objective function:
m.setObjective(quicksum(h[x, y, c, p, t] + v[x, y, c, p, t]
                        for x in range(Nx)
                        for y in range(Ny)
                        for c in range(C)
                        for p in range(P)
                        for t in range(T)), GRB.MINIMIZE)
I don't want to nitpick, but there is no such thing as a "more optimal" solution: "optimal" is already the superlative. In case you are really only looking for a feasible solution without regard for the objective function, you should follow Erwin's advice and not set an objective function at all. It's hard to believe, though, that your current objective function is completely meaningless, so a better approach is probably to reduce the objective function to include only a few variables and also to set a higher MIPGap to terminate the solve earlier.
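With Gurobi's C++ API, the feasibility-first setup could look something like this (a sketch; the model-building part is elided and the parameter values are purely illustrative):

#include "gurobi_c++.h"

GRBEnv env;
GRBModel m(env);
// ... add the binary variables and constraints ...

// Option 1: drop the objective entirely and stop at the first feasible point.
m.setObjective(GRBLinExpr(0.0), GRB_MINIMIZE); // constant objective
m.set(GRB_IntParam_SolutionLimit, 1);          // stop after one solution

// Option 2: keep a (reduced) objective but allow early termination.
m.set(GRB_DoubleParam_MIPGap, 0.5); // accept anything within 50% of optimal

m.optimize();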

Gecode vs. Z3 for Constrained Randomization

I'm looking for a C++-based alternative to the SystemVerilog language.
While I doubt anything out there can match the simplicity and flexibility of the SystemVerilog constraint language, I have settled on using either Z3 or Gecode for what I'm working on, primarily because they're both under the MIT license.
What I'm looking for is:
Support for variable-sized bit vectors AND bit-vector arithmetic/logic operations. For example:
bit_vector a<30>;
bit_vector b<30>;
constraint {
    a == (b << 2);
    a == (b * 2);
    b < a;
}
The problem with Gecode, as far as I can tell, is that it does not provide bit vectors out of the box. However, its programming model seems a bit simpler, and it does provide a means to create your own variable types. So I could perhaps create some kind of wrapper around the C++ bitset, similar to how IntVar wraps 32-bit integers. However, that would lack the ability to express multiplication-based constraints, since C++ bitsets don't support such operations.
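One possible workaround in Gecode (a sketch, not a built-in facility, and only viable while the value fits in Gecode's integer domain) is to model a small bit vector as a BoolVarArray tied to an IntVar through a weighted linear constraint, so arithmetic constraints such as a == b * 2 can be posted on the integer view:

#include <gecode/int.hh>
using namespace Gecode;

// Inside a Space subclass, for an 8-bit vector (hypothetical width):
BoolVarArray abits(*this, 8, 0, 1);
IntVar a(*this, 0, 255);
IntArgs weights(8);
for (int i = 0; i < 8; ++i)
    weights[i] = 1 << i; // bit i contributes 2^i
// sum(weights[i] * abits[i]) == a, i.e. a is the integer value of the bits
linear(*this, weights, abits, IRT_EQ, a);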
Z3 does provide bit vectors right out of the box, but I'm not sure how it would handle constraints on, for example, 128-bit vectors. I'm also unsure how I can specify that I want to produce a variety of randomized variables that satisfy a constraint when possible. With Gecode, this is much clearer, given how thorough its documentation is.
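For what it's worth, the constraints from the example above are straightforward to express with Z3's C++ API (a sketch, using 30-bit vectors as in the example):

#include <z3++.h>

z3::context c;
z3::expr a = c.bv_const("a", 30);
z3::expr b = c.bv_const("b", 30);
z3::solver s(c);
s.add(a == z3::shl(b, 2)); // a == (b << 2)
s.add(a == b * 2);         // a == (b * 2)
s.add(z3::ult(b, a));      // b < a, unsigned comparison
if (s.check() == z3::sat) {
    z3::model m = s.get_model();
    // m.eval(a) and m.eval(b) give one satisfying assignment
}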
A simple constraint programming model, close or similar to SystemVerilog. For example, a syntax where I only need to type (x == y + z) instead of something like EQ(x, y + z). As far as I can tell, both APIs provide such a simple programming model.
A means of performing constrained randomization, for the sake of producing random stimulus. That is, I can provide a random seed that, depending on the constraints, results in an answer that may differ from the previous answer, similar to how SystemVerilog randomize calls may produce new random results. Gecode seems to support the use of random seeds; with Z3, it's much less clear.
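In Gecode, the seeding goes through an Rnd object passed to a randomized variable/value selection strategy, roughly like this (a sketch; vars stands for the model's variable array):

#include <gecode/int.hh>
using namespace Gecode;

// Inside the Space subclass's constructor:
Rnd rnd(42U); // different seeds give different search orders
branch(*this, vars, INT_VAR_RND(rnd), INT_VAL_RND(rnd));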
Support for weighted distribution. Gecode appears to support this via weighted sets. I imagine I can establish a relationship between certain conditions and boolean variables, and then add weights to those variables. Z3 appears to be more flexible in that you can assign weights to expressions, via the optimize class.
At the moment, I'm undecided, because Z3 lacks in documentation what Gecode lacks in out-of-the-box variable types. I'm wondering if anyone has any prior experience using either tool to achieve what SystemVerilog could. I'd like to hear any suggestions for any other API under a flexible license as well.
While z3 (or any SMT solver) can handle all of these, getting a nice sampling of satisfying assignments would be rather difficult to control. SMT solvers are optimized for just giving you a model, and they don't offer much control over how you sample the solution space.
Incidentally, this is an active research area in SMT solving. Here's a paper that appeared only 6 weeks ago on this very topic: https://ieeexplore.ieee.org/document/8894251
So, I'd say if support for "good sampling" is your primary motivation, using an SMT solver is probably not the best choice. If your goal is to find satisfying assignments for bit-vectors expressed conveniently (there are high-level APIs in just about any language these days), then z3 would be an extremely fine choice.
From your description, good sampling sounds like the primary motivation though, and for that SMT solvers are probably not that great. At least not for the time being.
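The usual workaround with z3 is to enumerate distinct models by blocking each one found, roughly as below (a sketch continuing the earlier snippet). Note that this enumerates solutions in whatever order the solver produces them; it does not sample uniformly, which is exactly the limitation discussed above:

// Assuming the context c, solver s, and bit vectors a and b from before:
for (int i = 0; i < 10 && s.check() == z3::sat; ++i) {
    z3::model m = s.get_model();
    // use m.eval(a), m.eval(b) as one stimulus, then forbid this
    // exact assignment so the next check yields a different model:
    s.add(a != m.eval(a) || b != m.eval(b));
}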

Getting all solutions in Google or-tools

I have a linear problem: finding all solutions that meet all constraints.
For example, my variables are [0.323, 0.123, 1.32, 6.3...].
Is it possible to get, for example, the top 100 solutions sorted by a fitness (maximization/minimization) function?
In a continuous LP, enumerating different solutions is a difficult concept. E.g., consider max x s.t. x <= 1. Obviously x=1 and x=0.99999 are solutions, and so are the infinitely many solutions in between. We could enumerate "corner solutions" (or basic solutions). See here for an example. Such a scheme could be adapted to find the first 100 different corner points sorted by the objective. For models with discrete variables, many constraint programming solvers will give you the possibility to find many solutions.
If you can define a fitness function as you suggested, then you might first want to solve the LP that maximizes this function. Afterwards you can add an objective cutoff that forces your second solution to be slightly worse than the first. You can implement this by introducing a cut that is your objective function with a right-hand side of (optimal value - epsilon), as sketched below.
Of course, this will not give you all (basic) solutions, but you might discover which variables are always at the same value or how much variance there is between the different solutions.
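With or-tools' MPSolver (C++), the cutoff idea could be sketched like this (model building elided; the epsilon value is a hypothetical choice):

#include "ortools/linear_solver/linear_solver.h"
using operations_research::MPConstraint;
using operations_research::MPSolver;

MPSolver solver("lp", MPSolver::GLOP_LINEAR_PROGRAMMING);
// ... variables, constraints, and a maximization objective ...
solver.Solve();
double best = solver.Objective().Value();

// Cut off the previous optimum: objective <= best - epsilon.
const double eps = 1e-6;
MPConstraint* cut = solver.MakeRowConstraint(-MPSolver::infinity(), best - eps);
for (const auto* v : solver.variables())
    cut->SetCoefficient(v, solver.Objective().GetCoefficient(v));
solver.Solve(); // yields the next, slightly worse solution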

Compare two QAbstractItemModels

I'm trying to figure out an efficient algorithm that takes two QAbstractItemModels (trees) (A, B) and computes the differences between them, such that I get a list of items that are not present in A (but are in B, i.e. added), or items that have been modified / deleted.
The only way I can currently think of is doing a breadth-first search of A for every item in B. But this doesn't seem very efficient. Any ideas are welcome.
Have you tried using magic?
Seriously though, this is a very broad question, especially if we consider the fact that it is a QAbstractItemModel and not a QAbstractListModel. For a list it would be much simpler, but an abstract item model implements a tree structure, so there are a lot of variables:
do you check for total item count
do you check for item count per level
do you check if an item is contained in both models
    if so, is it contained at the same level
    if so, is it contained at the same index
is the item in its original state or has it been modified
You need to make all those considerations and come up with an efficient solution. And don't expect it will be as simple as a "by the book" algorithm. The good news is that since you are dealing with isolated items, it will be easier than trying to do the same for text; with text, you can't hope to get anywhere near as concise a result as with isolated items. I've had my fair share of absurdly mindless GitHub diff results.
And just in case that's your actual goal: it will be much easier to achieve by tracking the history of the derived data set than by doing a blind comparison. Tracking history is much easier if you want to establish what was added, what was deleted, what was moved, and what was modified, because it considers the actual event flow rather than just comparing end results. Especially if you don't have any persistent ID scheme implemented: without one, how would you tell whether item X has been deleted, or merely moved to a new level/index and modified, and so on?
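If you do go the blind-comparison route, a starting point could be to flatten each model into a hash and compare the hashes (a sketch; it assumes items are uniquely identified by their Qt::DisplayRole text, which stands in for a real persistent ID that your models may not have):

#include <QAbstractItemModel>
#include <QHash>
#include <QString>

static void collect(const QAbstractItemModel& m, const QModelIndex& parent,
                    QHash<QString, QPersistentModelIndex>& out) {
    for (int r = 0; r < m.rowCount(parent); ++r) {
        const QModelIndex idx = m.index(r, 0, parent);
        out.insert(idx.data(Qt::DisplayRole).toString(), idx);
        collect(m, idx, out); // recurse into the subtree
    }
}

// Keys present in B's hash but not in A's are "added"; keys in A's but
// not in B's are "deleted"; everything else needs a deeper check
// (level, index, modification) along the lines of the list above.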
Also, worry about efficiency only after you have empirically established a performance issue. Some algorithms may seem overly complex, but modern machines are very fast, and unless you are running the comparison in a tight loop you shouldn't really worry about it. In the end, it doesn't boil down to how complex it is; it boils down to whether it is fast enough or not.

When to use which sorting algorithm, and when definitely not to

We see a lot of sorting techniques like merge sort, quicksort, and heap sort. Could you help me decide which of these sorting techniques should be used in which kind of environment (i.e., for which kind of problem)? When should we use which of these sorting algorithms, and where should we not (what are their disadvantages in time and space)?
I am expecting an answer in something like this form:
a) We would use merge sort when... we should definitely not use merge sort when...
b) We would use quicksort when... we should definitely not use quicksort when...
There are a few basic parameters that characterize the behavior of each sorting algorithm:
average case computational complexity
worst case computational complexity
memory requirements
stability (i.e. is it a stable sort or not?)
All of these are widely documented for all commonly used sorts, and this is all the information needed to provide an answer in the format you want. However, since even four parameters per sort make for a lot of things to consider, not all of which will be relevant, it isn't a very good idea to try to give such a "scripted" answer. Furthermore, even more advanced concepts could come into consideration (such as behavior on almost-sorted or reverse-sorted data, cache performance, and resistance to maliciously constructed input), making such an answer even more lengthy and error-prone.
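For the three sorts you name, the four basic parameters look like this (standard textbook values):

Algorithm    Average      Worst        Extra memory   Stable
Merge sort   O(n log n)   O(n log n)   O(n)           yes
Quicksort    O(n log n)   O(n^2)       O(log n)       no (typically)
Heap sort    O(n log n)   O(n log n)   O(1)           no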
I suggest that you spend some time familiarizing yourself with the four basic concepts mentioned above, perhaps by visualizing how each type of sort works on simple input and by reading an introductory text on sorting algorithms. Do this, and soon enough you will be able to answer such questions yourself.
For starters, take a look at this comparison table on Wikipedia; the comparison criteria will give you clues about what to look for in an algorithm and its possible trade-offs.