Checking if a list of arbitrary inequalities are satisfied - c++

I am writing some code in C++ and need to check whether or not a list of inequalities in two unknown variables are satisfied.
For example, one possible list could be P = Q, Q < S, P = S, which should NOT be satisfied.
Another example, P = Q, Q < S, R = P, S > R, should be satisfied.
I have thought long and hard about how to do this, but I cannot seem to find any method other than a long, tedious one which involves checking whether every newly added inequality is consistent with all the previous ones.

This is more of an exercise in boolean logic than C++. If you know Python use it.
A faster way would be to construct an "ordering" (in the maths sense) one statement at a time. That is where you have a sequence, say a b c, and a < b < c.
Suppose now each thing in the sequence is a vector of things equal to each other. In the above example:
P=Q
Our ordering looks like:
[P,Q]
Next Q < S
So now as we have an ordering:
[P,Q],[S]
We know [P,Q]<[S]
So P=S is an obvious contradiction.

First, remove all equalities P = Q by replacing all occurrences of Q with P.
This reduces P = Q, Q < S, R = P, S > R to R < S, S > R.
Second, build a directed graph with the variables as vertices and an edge from P to Q if your list contains P < Q or Q > P.
Third, check if the graph contains cycles. The inequalities are satisfiable iff the graph contains no cycles.
You might want to google for 2-SAT, which is related to this problem.
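The three steps above can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the Constraint struct and the function name satisfiable are made up for this example, equalities are merged with a union-find structure instead of textual replacement, and the cycle check is a plain depth-first search.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical input format: "lhs op rhs", where op is '=', '<' or '>'.
struct Constraint {
    std::string lhs;
    char op;
    std::string rhs;
};

// Returns true if some assignment of values satisfies every constraint.
bool satisfiable(const std::vector<Constraint>& cs) {
    // Give every variable name a dense integer id.
    std::map<std::string, std::size_t> id;
    for (const auto& c : cs) {
        id.emplace(c.lhs, id.size());
        id.emplace(c.rhs, id.size());
    }

    // Step 1: merge variables joined by '=' (union-find with path halving).
    std::vector<std::size_t> parent(id.size());
    std::iota(parent.begin(), parent.end(), std::size_t{0});
    auto find = [&parent](std::size_t x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    };
    for (const auto& c : cs) {
        if (c.op == '=') {
            const std::size_t a = find(id[c.lhs]);
            const std::size_t b = find(id[c.rhs]);
            parent[a] = b;
        }
    }

    // Step 2: one edge smaller -> larger for every strict inequality.
    std::vector<std::vector<std::size_t>> adj(id.size());
    for (const auto& c : cs) {
        if (c.op == '=') continue;
        std::size_t a = find(id[c.lhs]);
        std::size_t b = find(id[c.rhs]);
        if (c.op == '>') std::swap(a, b);
        adj[a].push_back(b);
    }

    // Step 3: depth-first search; reaching a vertex that is still on the
    // current path means the graph has a cycle.
    std::vector<int> state(id.size(), 0);  // 0 = unvisited, 1 = on path, 2 = done
    std::function<bool(std::size_t)> hasCycle = [&](std::size_t u) -> bool {
        state[u] = 1;
        for (std::size_t v : adj[u]) {
            if (state[v] == 1) return true;
            if (state[v] == 0 && hasCycle(v)) return true;
        }
        state[u] = 2;
        return false;
    };
    for (std::size_t u = 0; u < adj.size(); ++u) {
        if (state[u] == 0 && hasCycle(u)) return false;
    }
    return true;
}
```

With the two lists from the question, the first one merges P, Q and S into one class, so Q < S becomes a self-loop and the function returns false; the second one has no cycle and returns true.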

C++20 comparing two lazily sorted ranges

The question
I have two ranges, call them v,w that are sorted in a given fashion and can be compared (call the order relation T). I want to compare them lexicographically but after sorting them in a different way (call this other order relation S). For this I do not really need the ranges to be completely sorted: I only need to lazily evaluate the elements on the sorted vectors until I find a difference. For example if the maximum of v in this new order is larger than the maximum of w, then I need to only look once in the ordered vectors. In the worst case that v == w I'd look up in all elements.
I understand that C++20 std::ranges::views allows me to get a read-only view of v and w that is lazily evaluated. Is it possible to get a custom sorted view that is still lazily evaluated? If I were able to define some pseudocode like
auto v_view_sorted_S = v | std::views::lazily_sort();
auto w_view_sorted_S = w | std::views::lazily_sort();
Then I could simply call std::ranges::lexicographical_compare(v_view_sorted_S, w_view_sorted_S).
How does one implement this?
Would simply calling std::ranges::sort(std::views::all(v)) work, in the sense that it accepts a view instead of an actual range and, more importantly, evaluates the view lazily? I gather from the comments on the reply to this question that under some conditions std::ranges::sort can be applied to views, even transformed ones. But I suspect that it sorts them at call time; is that the case?
The case I want it used:
I am interested in any example, but the very particular use case that I have is the following. It is irrelevant for the question, but helps put this in context.
The structures v and w are of the form
std::array<std::vector<unsigned int>,N> v;
Where N is a compile-time constant. Moreover, for each 0 <= i < N, v[i] is guaranteed to be non-increasing. The lexicographical order thus obtained for any two ordered arrays is what I called T above.
What I am interested in is comparing them by the following rule: given entries a = v[i][j] and b = v[k][l] with 0 <= i,k < N and j,l >= 0, declare a > b if that relation holds as unsigned integers, or if a == b as unsigned integers and i < k.
After ordering all entries of v and w with respect to this order, then I want to compare them lexicographically.
Example, if v = {{2,1,1}, {}, {3,1}}, w = {{2,1,0}, {2}, {3,0}} and z = {{2,1,0}, {3}, {2,0}}, then z > w > v.
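One way to get the lazy behaviour asked about, without a dedicated view, is to build a binary heap over each structure (linear time) and pop the largest remaining entry from both only until the first difference. The sketch below is an illustration, not a standard-library facility: the names Rows, Entry, lessS, makeHeapS and compareS are invented here, and N = 3 is an assumed value chosen to match the example.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::size_t N = 3;  // assumed; any compile-time constant works
using Rows = std::array<std::vector<unsigned int>, N>;
using Entry = std::pair<unsigned int, std::size_t>;  // (value, row index i)

// Order S from the question: a is below b if its value is smaller,
// or the values are equal and a comes from a later row.
bool lessS(const Entry& a, const Entry& b) {
    if (a.first != b.first) return a.first < b.first;
    return a.second > b.second;
}

// Flattens all rows into (value, row) pairs and heapifies them in O(n),
// so the largest entry under S sits at the front.
std::vector<Entry> makeHeapS(const Rows& rows) {
    std::vector<Entry> heap;
    for (std::size_t i = 0; i < N; ++i) {
        for (unsigned int x : rows[i]) heap.emplace_back(x, i);
    }
    std::make_heap(heap.begin(), heap.end(), lessS);
    return heap;
}

// Lexicographic comparison of the S-sorted sequences.
// Returns a negative value, zero or a positive value for a < b, a == b, a > b.
int compareS(const Rows& a, const Rows& b) {
    std::vector<Entry> ha = makeHeapS(a);
    std::vector<Entry> hb = makeHeapS(b);
    while (!ha.empty() && !hb.empty()) {
        const Entry x = ha.front();
        const Entry y = hb.front();
        if (lessS(x, y)) return -1;
        if (lessS(y, x)) return 1;
        // Equal so far: discard the current maxima, O(log n) each.
        std::pop_heap(ha.begin(), ha.end(), lessS);
        ha.pop_back();
        std::pop_heap(hb.begin(), hb.end(), lessS);
        hb.pop_back();
    }
    if (ha.empty() && hb.empty()) return 0;
    return ha.empty() ? -1 : 1;  // a proper prefix compares as smaller
}
```

If the first difference appears after k entries, the cost is the linear heap construction plus k pops, instead of a full sort of both structures. For the example above, compareS(z, w) and compareS(w, v) are both positive.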

Having an iterator as a sliding window of 3 elements that can overshoot bounds (possibly using Boost)

Having read this SO post and exploring Boost.Iterator, I want to see if I can make a sliding window of size 3 iterate through a single vector where the final iteration has an 'empty third element'.
Assuming that the vector size is >= 2, an example:
{a, b, c, d, e, f, g}
We will always start on index 1 because this algorithm I'm implementing requires a 'previous' element to be present and does not need to operate on the first element (so we would iterate from i = 1 while i < size()):
V
[a, b, c]
{a, b, c, d, e, f, g}
when I move to the next iteration, it would look like:
V
[b, c, d]
{a, b, c, d, e, f, g}
and upon reaching the last element in the iteration, it would have this:
V
[f, g, EMPTY]
{a, b, c, d, e, f, g}
What I want is to be able to grab the "prev", check "hasNext", and grab the next element if available. My goal is very clean modern C++ code, and not having to do the bookkeeping of tracking pointers/references for three different elements makes the code a lot cleaner:
for (const auto& it : zippedIterator(dataVector)) {
    someFunc(it.first, it.second);
    if (someCondition(it.second) && hasThirdElement) {
        anotherFunc(it.second, it.third);
    }
}
I was trying to see if this is possible with boost's zip iterator, but I don't know if it allows me to overshoot the end and have some empty value.
I've thought of doing some hacky stuff like having a dummy final element, but then I have to document it and I'm trying to write clean code with zero hacky tricks.
I was also going to roll my own iterator but apparently std::iterator is deprecated.
I also don't want to create a copy of the underlying vector since this will be used in a tight loop that needs to be fast and copying everything would be very expensive for the underlying objects. It doesn't need to be extremely optimized, but copying the iterator values into a new array is out of the question.
If this were a matter of simply having a sized window into a range, then what you really want is to have a range that you can advance. In your case, that range is 3 elements long, but there's no reason that a general mechanism couldn't allow for a variable-sized range. It would just be a pair of iterators, such that you can ++ or -- both of them at the same time.
The problem you run into is that you want to manufacture an element if the subrange is off the end of the range. That complicates things; that would require proxy-iterators and so forth.
If you want a solution for your specific case (a 3-element sized range, where the last element can be manufactured if it's off the end of the main range), then you first need to decide if you want to have an actual type for this. That is, is it worth implementing a whole type, rather than a couple of one-off utility functions?
My way to handle this would be to redefine the problem. What you seem to have is a current element, just like any other iteration. But you want to be able to access the previous element. And you want to be able to peek ahead to the next element; if there is none, then you want to manufacture some default. So... perform iteration, but write a couple of utility functions that let you access what you need from the current element.
for(auto curr = std::next(dataVector.begin());
    curr != dataVector.end();
    ++curr)
{
    someFunc(prevElement(curr), *curr);
    auto nextIt = curr + 1;
    if(nextIt != dataVector.end() && someCondition(*curr))
        anotherFunc(*curr, *nextIt);
}
prevElement is a simple function that accesses the element before the given iterator.
template<typename It>
//requires BidirectionalIterator<It>
decltype(auto) prevElement(It curr) {return *(--curr);}
If you want to have a function to check the next element and manufacture a value for it, that can be done too. This one has to return a prvalue of the element, since we may have to manufacture it:
template<typename It>
//requires ForwardIterator<It>
auto checkNextElement(It curr, It endIt)
{
    ++curr;
    if(curr == endIt)
        return typename std::iterator_traits<It>::value_type{};
    return *curr;
}
Yes, this isn't all that clever, with special range types and the like. But the stuff you're doing is hardly common, particularly having to manufacture the next element as you do. By keeping things simple and obvious, you make it easy for someone to read your code without having to understand some specialized sub-range type.
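Since the question rules out copying elements, a variant of the same idea can hand back a pointer to the next element and use nullptr for the "EMPTY" slot instead of manufacturing a value. The following is a minimal sketch under that assumption; peekNext and windows are hypothetical names made up for this example, and peekNext assumes an iterator whose pointer type is a raw pointer, as is the case for std::vector.

```cpp
#include <iterator>
#include <string>
#include <vector>

// Element just before curr; curr must not be the first element.
template <typename It>
decltype(auto) prevElement(It curr) { return *std::prev(curr); }

// Pointer to the element after curr, or nullptr if curr is the last one.
template <typename It>
typename std::iterator_traits<It>::pointer peekNext(It curr, It endIt) {
    ++curr;
    return curr == endIt ? nullptr : &*curr;
}

// Demo: renders every window "prev,curr,next", starting at index 1.
std::vector<std::string> windows(const std::vector<std::string>& data) {
    std::vector<std::string> out;
    if (data.size() < 2) return out;
    for (auto curr = std::next(data.begin()); curr != data.end(); ++curr) {
        const std::string* next = peekNext(curr, data.end());
        out.push_back(prevElement(curr) + "," + *curr + "," +
                      (next ? *next : std::string("EMPTY")));
    }
    return out;
}
```

For the input {"a", "b", "c"} this yields "a,b,c" followed by "b,c,EMPTY". No element of the underlying vector is copied until the demo function builds its output strings.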

Total merging search optimization

Assume there is a vector VA of size N, and each element is another vector with elements of type T. There is an operation on type T that returns a new value of type T, i.e., bool merge(T a, T b, T &ret);. If a and b can be merged, it stores the result in ret and returns true; otherwise, it returns false. The merge operation is reflective and transitive.
A solution is found if either:
∃ x0, x1, ..., xN-1. merge(VA[0][x0], VA[1][x1], merge(VA[2][x2], ..., merge(VA[N-2][xN-2],VA[N-1][xN-1], ret)...));
any elements from N-1 (not N) sub-vectors can be merged (pick any N-1 with exactly one exception).
For example:
VA is of size 3. Element a can be merged with Element b with the result c. Element c can be merged with Element d with the result e.
VA[0] = {a}
VA[1] = {b, q}
VA[2] = {d, r}
All solutions in the above example are: {a,b}, {a,d}, {b,d}, {a,b,d}.
The task is to find all solutions in the given vector VA.
My C++ code is:
void findAll(unsigned int step, unsigned int size, const T& pUnifier, int hole_id) {
    if(step == size) {
        printOneResult(pUnifier);
        return;
    }
    // Skip this sub-vector and record it as the hole.
    _path[step] = -1;
    findAll(step + 1, size, pUnifier, step);
    // Try to merge every element of this sub-vector into the current result.
    const std::vector<T>& vec = VA[step];
    for(std::vector<T>::const_iterator it = vec.begin(); it != vec.end(); ++it) {
        T nextUnifier;
        if( merge( *it, pUnifier, nextUnifier )) {
            _path[step] = it->getID();
            findAll(step + 1, size, nextUnifier, hole_id);
        }
    }
}
The code contains recursive calls; however, it is not tail recursive. It is running slowly in practice. In reality, the size of VA is possibly hundreds and each sub-vector size is of hundreds, too. I'm wondering whether it can be optimized.
Thank you very much.
If I'm understanding your code correctly, you're performing a (recursive) brute-force search. This is not efficient, since you're given some information about your search space.
I think a good candidate here would be the A* algorithm. You could use the current greatest-chain size as the heuristic, or perhaps even the sum of the squares of the chain sizes.
To improve your code, since you use vectors, you should use the [] operator with an int counter instead of plain iterators, which are much slower.
You can improve it even more by minimising the function calls in either of your loops, for example by storing in advance the values you will use.
Since you didn't explain what a T_VEC really is, I couldn't write the complete iterator-free version, but this should already be a great plus regarding speed.

How does the CYK algorithm work?

I have to check if a string can be derived from a given context-free grammar that is in Chomsky normal form. I'm using C++.
There is very nice pseudocode on the Wikipedia article covering the CYK algorithm, but I can't understand it very well.
Would someone be so kind to help me out by giving me another pseudocode for CYK algorithm, or maybe explain the one in the wiki article?
The CYK algorithm takes as input a CFG that's in Chomsky normal form. That means that every production either has the form
S → a, for some terminal a, or
S → AB, for some nonterminals A and B.
Now, imagine you have a string w and you want to see whether you can derive it from a grammar whose start symbol is S. There are two options:
(1) If w is a single character long, then the only way to parse it would be to use a production of the form S → a for some character a. So see whether any of the single-character productions would match a.
(2) If w is more than one character long, then the only way to parse it is to use a production of the form S → AB for some nonterminals A and B. That means that we need to divide the string w into two pieces x and y where A derives x and B derives y. One way to do that is to try all possible ways of splitting w into two pieces and to see if any of them work.
Notice that option (2) here ends up being a recursive parsing problem: to see whether you can derive w from S, see whether you can derive x from A and y from B.
With that insight, here's pseudocode for recursive function you can use to see whether a nonterminal S derives a string w:
bool canDerive(nonterminal S, string w) {
    return canDeriveRec(S, w, 0, w.size());
}
/* Can you derive the substring [start, end) of w from S? */
bool canDeriveRec(nonterminal S, string w, int start, int end) {
    /* Base case: Single characters */
    if (end - start == 1) {
        return whether there is a production S -> a, where a = w[start];
    }
    /* Recursive case: Try all possible splits */
    for (each production S -> AB) {
        for (int mid = start + 1; mid < end; mid++) {
            if (canDeriveRec(A, w, start, mid) &&
                canDeriveRec(B, w, mid, end)) {
                return true;
            }
        }
    }
    return false;
}
This algorithm works correctly, but if you map out the shape of the recursion you'll find that
it makes a ton of redundant recursive calls, but
there aren't that many different possible recursive calls.
In fact, the number of distinct possible calls is O(n^2 N), where n is the length of the input string (for each possible combination of a start and end index) and N is the number of nonterminals in the grammar. These observations suggest that this algorithm would benefit either from memoization or dynamic programming, depending on which approach you happen to think is nicer.
The CYK algorithm is what you get when you take the above recursive algorithm and memoize the result, or equivalently when you convert the above recursive algorithm into a dynamic programming problem.
There are O(n^2 N) possible recursive calls. For each production tried, it does O(n) work. If there are P productions, on average, for a nonterminal, this means the overall runtime is O(n^3 NP), which is O(n^3) for a fixed grammar.
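As a rough illustration of the dynamic-programming form, here is a bottom-up sketch in C++. The grammar representation is an assumption made for this example (nonterminals numbered from 0, and the hypothetical structs UnitRule, BinaryRule and Grammar); it is not taken from the Wikipedia pseudocode.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct UnitRule { std::size_t lhs; char terminal; };        // lhs -> terminal
struct BinaryRule { std::size_t lhs, left, right; };        // lhs -> left right
struct Grammar {
    std::size_t numNonterminals;  // nonterminals are 0 .. numNonterminals-1
    std::vector<UnitRule> unitRules;
    std::vector<BinaryRule> binaryRules;
};

// True if nonterminal `start` derives w, for a grammar in Chomsky normal form.
bool cykDerives(const Grammar& g, std::size_t start, const std::string& w) {
    const std::size_t n = w.size();
    if (n == 0) return false;  // the productions described above never yield ""

    // table[len - 1][i][A] != 0  <=>  A derives the substring w[i, i + len).
    using Row = std::vector<char>;
    std::vector<std::vector<Row>> table(
        n, std::vector<Row>(n, Row(g.numNonterminals, 0)));

    // Base case: substrings of length 1 come from rules A -> a.
    for (std::size_t i = 0; i < n; ++i) {
        for (const UnitRule& r : g.unitRules) {
            if (r.terminal == w[i]) table[0][i][r.lhs] = 1;
        }
    }

    // Longer substrings: try every split point and every rule A -> B C.
    for (std::size_t len = 2; len <= n; ++len) {
        for (std::size_t i = 0; i + len <= n; ++i) {
            for (std::size_t split = 1; split < len; ++split) {
                for (const BinaryRule& r : g.binaryRules) {
                    if (table[split - 1][i][r.left] &&
                        table[len - split - 1][i + split][r.right]) {
                        table[len - 1][i][r.lhs] = 1;
                    }
                }
            }
        }
    }
    return table[n - 1][0][start] != 0;
}
```

The table has one entry per (start index, length, nonterminal), which is where the O(n^2 N) count comes from, and the loop over split points contributes the remaining factor of n. As a usage example, with S = 0, A = 1, B = 2, X = 3 and the rules S → AB, S → AX, X → SB, A → a, B → b, the function accepts "ab" and "aabb" and rejects "aab".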

Comparing two vectors of maps

I've got two ways of fetching a bunch of data. The data is stored in a sorted vector<map<string, int> >.
I want to identify whether there are inconsistencies between the two vectors.
What I'm currently doing (pseudo-code):
for i in 0... min(length(vector1), length(vector2)):
    for (k, v) in vector1[i]:
        if v != vector2[i][k]:
            // report that k is bad for index i,
            // with vector1 having v, vector2 having vector2[i][k]
for i in 0... min(length(vector1), length(vector2)):
    for (k, v) in vector2[i]:
        if v != vector1[i][k]:
            // report that k is bad for index i,
            // with vector2 having v, vector1 having vector1[i][k]
This works in general, but breaks horribly if vector1 has a, b, c, d and vector2 has a, b, b1, c, d (it reports brokenness for b1, c, and d). I'm after an algorithm that tells me that there's an extra entry in vector2 compared to vector1.
I think I want to do something where, when I encounter mismatched entries, I look at the next entries in the second vector, and if a match is found before the end of the second vector, I store the index i of the entry found in the second vector and move on to matching the next entry in the first vector, beginning with vector2[i+1].
Is there a neater way of doing this? Some standard algorithm that I've not come across?
I'm working in C++, so C++ solutions are welcome, but solutions in any language or pseudo-code would also be great.
Example
Given the arbitrary map objects: a, b, c, d, e, f and g;
With vector1: a, b, d, e, f
and vector2: a, c, e, f
I want an algorithm that tells me either:
Extra b at index 1 of vector1, and vector2's c != vector1's d.
or (I'd view this as an effectively equivalent outcome)
vector1's b != vector2's c and extra d at index 2 of vector1
Edit
I ended up using std::set_difference, and then doing some matching on the diffs from both sets to work out which entries were similar but different, and which had entries completely absent from the other vector.
Something like the std::mismatch algorithm
You could also use std::set_difference
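A minimal sketch of the std::set_difference approach follows. It assumes both vectors are sorted with the default operator< on the maps (std::map compares lexicographically by key/value pairs); the alias Record and the helper onlyIn are names invented for this example.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <string>
#include <vector>

using Record = std::map<std::string, int>;

// Entries of `a` that have no equal entry in `b`.
// Precondition: both vectors are sorted with operator<.
std::vector<Record> onlyIn(const std::vector<Record>& a,
                           const std::vector<Record>& b) {
    std::vector<Record> out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out));
    return out;
}
```

Calling onlyIn(vector1, vector2) and onlyIn(vector2, vector1) gives the entries unique to each side; pairing up "similar but different" entries between those two results is a separate step that this sketch does not attempt.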
It sounds like you're looking for the diff algorithm. The idea is to identify the longest common subsequence of the two vectors (using map equality), then recurse down the non-common portions. Eventually you'll have an alternating list of vector sub-sequences that are identical, and sub-sequences that have no common elements. You can then easily produce whatever output you like from this.
Apply it to the two vectors, and there you go.
Note that since map comparison is expensive, if you can hash the maps (use a strong hash - collisions will result in incorrect output) and use the hashes for comparisons you'll save a lot of time.
Once you're down to the mismatched subsequences at the end, you'll have something like:
Input vectors: a b c d e f, a b c' d e f
Output:
COMMON a b
LEFT c
RIGHT c'
COMMON d e f
You can then individually compare the maps c and c' to figure out how they differ.
If you have a mutation and insertion next to each other, it gets more complex:
Input vectors: a b V W d e f, a b X Y d e f
Output:
COMMON a b
LEFT V W
RIGHT X Y
COMMON d e f
Determining whether to match V and W against X or Y (or not at all) is something you'll have to come up with a heuristic for.
Of course, if you don't care about how the content of the maps differ, then you can stop here, and you have the output you want.
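The diff idea can be sketched with the textbook longest-common-subsequence table followed by a walk through it. This is an illustrative version, not the implementation of any particular diff tool: it uses quadratic memory, compares maps directly with == instead of hashes, and the names Record, DiffOp and diff are made up here.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Record = std::map<std::string, int>;

struct DiffOp {
    char tag;           // 'C' = common, 'L' = only in left, 'R' = only in right
    std::size_t index;  // index into left for 'C' and 'L', into right for 'R'
};

std::vector<DiffOp> diff(const std::vector<Record>& left,
                         const std::vector<Record>& right) {
    const std::size_t n = left.size();
    const std::size_t m = right.size();

    // lcs[i][j] = length of the longest common subsequence of
    // left[i..n) and right[j..m), filled from the back.
    std::vector<std::vector<std::size_t>> lcs(
        n + 1, std::vector<std::size_t>(m + 1, 0));
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t j = m; j-- > 0;) {
            lcs[i][j] = (left[i] == right[j])
                            ? lcs[i + 1][j + 1] + 1
                            : std::max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }

    // Walk the table from the front, emitting one operation per element.
    std::vector<DiffOp> ops;
    std::size_t i = 0, j = 0;
    while (i < n && j < m) {
        if (left[i] == right[j]) {
            ops.push_back({'C', i});
            ++i;
            ++j;
        } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
            ops.push_back({'L', i++});
        } else {
            ops.push_back({'R', j++});
        }
    }
    while (i < n) ops.push_back({'L', i++});
    while (j < m) ops.push_back({'R', j++});
    return ops;
}
```

Runs of consecutive 'C', 'L' and 'R' operations correspond to the COMMON, LEFT and RIGHT blocks shown above. For the first example (a b c d e f against a b c' d e f) the tags come out as C C L R C C C.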
What exactly are you trying to achieve? Could you please define precisely what output you expect in terms of the input? Your pseudo code compares maps at the vector index. If that is not the correct semantics, then what is?
Can you associate some kind of checksum (or Bloom filter) with each map, so that a single check tells you whether a full comparison makes sense?
In your example, note that is not possible to differentiate between
Extra b at index 1 of vector1, and
vector2's c != vector1's d.
and
Extra b at index 1 of vector 1, extra
d at index 2 of v1, and extra c at 1
in v2
because it is not clear that "c" should be compared to "d"; it could just as well be compared to "b". I assume the vectors are not sorted, because std::map doesn't provide a relational operator. Rather, the maps are sorted, which as far as I can see is completely irrelevant ;-)
So your example is slightly misleading. It could even be
Compare
b f e a d
with
a c f e
You can check each element of the first vector against each element of the second vector.
This has quadratic runtime.
for i in 0... length(vector1):
    foundmatch = false;
    for j in 0... length(vector2):
        mismatch = false;
        for (k, v) in vector1[i]:
            if v != vector2[j][k]:
                mismatch = true;
                break; // no need to compare against the remaining keys.
        if (!mismatch) // found matching element j in vector2 for element i in vector1
            foundmatch = true;
            break; // no need to compare against the remaining elements in vector2
    if (foundmatch)
        continue;
    else
        // report that vector1[i] has no matching element in vector2[]
        // "extra b at i"
If you want to find the missing elements, just swap vector1 and vector2.
If you want to check whether an element in vector2 differs from an element in vector1 in only a single key, you have to add additional code around "no need to compare against the remaining keys".
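In C++ the quadratic check might look like the sketch below. One difference from the pseudocode is an assumption of this example: it uses full map equality (operator==), so a map in vector2 with extra keys does not count as a match. The names Record and unmatched are invented for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Record = std::map<std::string, int>;

// Indices of the elements of `a` that have no equal element anywhere in `b`.
// Runs in O(|a| * |b|) map comparisons and does not require sorted input.
std::vector<std::size_t> unmatched(const std::vector<Record>& a,
                                   const std::vector<Record>& b) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::find(b.begin(), b.end(), a[i]) == b.end()) {
            out.push_back(i);  // "extra a[i] at i"
        }
    }
    return out;
}
```

For the unsorted example above (b f e a d against a c f e), unmatched(vector1, vector2) reports indices 0 and 4 (b and d), and swapping the arguments reports index 1 (c).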