Counting how many decision variables are equal - c++

I'm a beginner user of Google OR-Tools, especially the CP-SAT. I'm using version 9.3, and I'm interested in the C++ version.
I'm modeling a problem where I need to count how many pairs of decision variables have the same (assigned) value. So, let's suppose I have a set of integer variables like this:
std::vector<IntVar> my_vars;
I also have a set of pairs like this:
std::vector<std::pair<size_t, size_t>> my_pairs;
Assume that all bounds, sizes, etc. are valid. Now, I want to compute how many of these pairs have the same value. Using IBM Ilog Concert, I can do it very straightforwardly using:
// Using Ilog Concert technology.
IloIntVar count(env, 0, MY_UPPER_BOUND);
IloIntExpr expr_count(env);
for(const auto& [u, v] : my_pairs) {
expr_count += (my_vars[u] == my_vars[v]);
}
model.add(count == expr_count);
Here, count is a decision variable that holds how many pairs have the same value in a given solution. The expression is a sum of boolean values comparing the actual decision variables' values, not the variable objects themselves (i.e., whether the object representing variable u is the same object as the one representing variable v).
Using OR-Tools, the equality operator == compares whether the variable objects (or representations of them) are equal, not the decision variable values. So, the following fails by generating an empty expression:
// Using Google Or-Tools CP-SAT.
IntVar count = cp_model
.NewIntVar(Domain(0, my_pairs.size()))
.WithName("count");
LinearExpr expr_count;
for(const auto& [u, v] : my_pairs) {
expr_count += (my_vars[u] == my_vars[v]);
}
cp_model.AddEquality(count, expr_count);
Note that, according to Google OR-Tools code (here), we have that:
class IntVar {
//...
bool operator==(const IntVar& other) const {
return other.builder_ == builder_ && other.index_ == index_;
}
//...
};
i.e., it checks whether the variables are the same object, not whether the values assigned to them are equal. Therefore, we cannot compare decision variables directly using CP-SAT, and we need to resort to another method.
Obviously, I can change the model using some big-M notation and linearize such expressions. However, can I do the counting without resorting to "remodeling"? I.e., is there a construct I can use "more or less" easily to address such cases?
I must mention that while I only depict one case here, I have quite a few counting variables over several sets like that. So, remodeling using big-M will be a big headache. I would prefer a simpler and more straightforward approach like Ilog Concert.
(Update) Little extension
Now, I want to do the same but comparing decision variables with scalars. For example:
std::vector<int> my_scalars;
for(size_t i = 0; i < my_scalars.size(); ++i) {
expr_count += (my_vars[i] == my_scalars[i]);
}
While this can be done using Ilog, it did not even compile with OR-Tools.
Thanks,
Carlos

Here is some tentative code:
IntVar count = model.NewIntVar(0, MY_UPPER_BOUND);
LinearExpr expr_count;
for(const auto& [u, v] : my_pairs) {
BoolVar is_equal = model.NewBoolVar();
model.AddEquality(my_vars[u], my_vars[v]).OnlyEnforceIf(is_equal);
model.AddNotEqual(my_vars[u], my_vars[v]).OnlyEnforceIf(is_equal.Not());
expr_count += is_equal;
}
model.AddEquality(expr_count, count);

With the help of #sascha and #Laurent, my solution is this one:
vector<BoolVar> is_equal;
is_equal.reserve(my_pairs.size());
for(const auto& [u, v] : my_pairs) {
is_equal.push_back(cp_model.NewBoolVar());
cp_model
.AddEquality(my_vars[u], my_vars[v])
.OnlyEnforceIf(is_equal.back());
cp_model
.AddNotEqual(my_vars[u], my_vars[v])
.OnlyEnforceIf(Not(is_equal.back()));
}
cp_model.AddEquality(LinearExpr::Sum(is_equal), count);
It is the same as #Laurent's in the end, but I save the boolean vars for later use.
For scalars, it looks like I don't need to make a constant, just compare directly with the expression.
Thanks, #Laurent and #sascha. You guys were very helpful.

Related

Is it possible / advisable to return a range?

I'm using the ranges library to help filter data in my classes, like this:
class MyClass
{
public:
MyClass(std::vector<int> v) : vec(v) {}
std::vector<int> getEvens() const
{
auto evens = vec | ranges::views::filter([](int i) { return ! (i % 2); });
return std::vector<int>(evens.begin(), evens.end());
}
private:
std::vector<int> vec;
};
In this case, a new vector is constructed in the getEvens() function. To save on this overhead, I'm wondering if it is possible / advisable to return the range directly from the function?
class MyClass
{
public:
using RangeReturnType = ???;
MyClass(std::vector<int> v) : vec(v) {}
RangeReturnType getEvens() const
{
auto evens = vec | ranges::views::filter([](int i) { return ! (i % 2); });
// ...
return evens;
}
private:
std::vector<int> vec;
};
If it is possible, are there any lifetime considerations that I need to take into account?
I am also interested to know if it is possible / advisable to pass a range in as an argument, or to store it as a member variable. Or is the ranges library more intended for use within the scope of a single function?
This was asked in the OP's comment section, but I think I will respond to it in the answer section:
The Ranges library seems promising, but I'm a little apprehensive about this returning auto.
Remember that even with the addition of auto, C++ is a strongly typed language. In your case, since you are returning evens, the return type will be the same as the type of evens. (Technically it will be the value type of evens, but evens was a value type anyway.)
In fact, you probably really don't want to type out the return type manually: std::ranges::filter_view<std::ranges::ref_view<const std::vector<int>>, MyClass::getEvens() const::<decltype([](int i) {return ! (i % 2);})>> (141 characters)
As mentioned by #Caleth in the comments, this wouldn't work either, since the type of evens involves a lambda defined inside the function, and the types of two different lambdas are different even if the lambdas are otherwise identical, so there's literally no way of spelling out the full return type here.
While there might be debates on whether to use auto in different cases, I believe most people would just use auto here. Plus, your evens was declared with auto too; typing the type out would just make it less readable here.
So what are my options if I want to access a subset (for instance even numbers)? Are there any other approaches I should be considering, with or without the Ranges library?
Depending on how you would access the returned data and the type of the data, you might consider returning std::vector<T*>.
views are really supposed to be viewed from start to end. While you could use views::drop and views::take to limit to a single element, it doesn't provide a subscript operator (yet).
There will also be computational differences. A vector needs to be computed beforehand, whereas views are computed while iterating. So when you do:
for(auto i : myObject.getEvens())
{
std::cout << i;
}
Under the hood, it is basically doing:
for(auto i : myObject.vec)
{
if(!(i % 2)) std::cout << i;
}
Depending on the amount of data and the complexity of the computations, views might be a lot faster than, or about the same as, the vector method. Plus, you can easily apply multiple filters on the same range without iterating through the data multiple times.
In the end, you can always store the view in a vector:
std::vector<int> vec2(evens.begin(), evens.end());
So my suggestion is: if you have the ranges library, then you should use it.
If not, then vector<T>, vector<T*>, or vector<index>, depending on the size and copyability of T.
There are no restrictions on the usage of components of the STL in the standard. Of course, there are best practices (e.g., string_view instead of string const &).
In this case, I can foresee no problems with handling the view return type directly. That said, the best practices are yet to be decided on since the standard is so new and no compiler has a complete implementation yet.
You're fine to go with the following, in my opinion:
class MyClass
{
public:
MyClass(std::vector<int> v) : vec(std::move(v)) {}
auto getEvens() const
{
return vec | ranges::views::filter([](int i) { return ! (i % 2); });
}
private:
std::vector<int> vec;
};
As you can see here, a range is just something on which you can call begin and end. Nothing more than that.
For instance, you can use the result of begin(range), which is an iterator, to traverse the range, using the ++ operator to advance it.
In general, looking back at the concept I linked above, you can use a range whenever the surrounding code only requires being able to call begin and end on it.
Whether this is advisable or enough depends on what you need to do with it. Clearly, if your intention is to pass evens to a function which expects a std::vector (for instance it's a function you cannot change, and it calls .push_back on the entity we are talking about), you clearly have to make a std::vector out of filter's output, which I'd do via
auto evens = vec | ranges::views::filter(whatever) | ranges::to_vector;
but if all the function which you pass evens to does is to loop on it, then
return vec | ranges::views::filter(whatever);
is just fine.
As regards lifetime considerations, a view is to a range of values what a pointer is to the pointed-to entity: if the latter is destroyed, the former will be dangling, and making improper use of it will be undefined behavior. This is an erroneous program:
#include <iostream>
#include <range/v3/view/filter.hpp>
#include <string>
#include <vector>
using namespace ranges;
using namespace ranges::views;
auto f() {
// a local vector here
std::vector<std::string> vec{"zero","one","two","three","four","five"};
// return a view on the local vector
return vec | filter([](auto){ return true; });
} // vec is gone ---> the view returned is dangling
int main()
{
// the following throws std::bad_alloc for me
for (auto i : f()) {
std::cout << i << std::endl;
}
}
You can use ranges::any_view as a type erasure mechanism for any range or combination of ranges.
ranges::any_view<int> getEvens() const
{
return vec | ranges::views::filter([](int i) { return ! (i % 2); });
}
I cannot see any equivalent of this in the STL ranges library; please edit the answer if you can.
EDIT: The problem with ranges::any_view is that it is very slow and inefficient. See https://github.com/ericniebler/range-v3/issues/714.
It is desirable to declare a function returning a range in a header and define it in a cpp file
for compilation firewalls (compilation speed)
stop the language server from going crazy
for better factoring of the code
However, there are complications that make it not advisable:
How to get type of a view?
If defining it in a header is fine, use auto
If performance is not an issue, I would recommend ranges::any_view
Otherwise I'd say it is not advisable.

C++ class design: dynamic typing alternative to template argument?

I would like to build a space-efficient modular arithmetic class. The idea is that the modulus M is an immutable attribute that gets fixed during instantiation, so if we have a large array (std::vector or another container) of values with the same M, M only needs to be stored once.
If M can be fixed at compile time, this can be done using templates:
template <typename num, num M> class Mod_template
{
private:
num V;
public:
Mod_template(num v=0)
{
if (M == 0)
V = v;
else
{
V = v % M;
if (V < 0)
V += M;
}
}
// ...
};
Mod_template<int, 5> m1(2); // 2 mod 5
However, in my application, we should be able to specify M at runtime. What I have looks like this:
template <typename num> class Mod
{
private:
const num M;
num V;
public:
Mod(num m, num v=0): M(abs(m))
{
if (M == 0)
V = v;
else
{
V = v % M;
if (V < 0)
V += M;
}
}
// ...
};
Mod<int> m2(5, 2); // 2 mod 5
Mod<int> m3(3); // 0 mod 3
This works, but a large vector of mod M values uses 2x the space it needs to.
I think the underlying conceptual problem is that Mod's of different moduli are syntactically of the same type even though they "should" be different types. For example, a statement like
m2 = m3;
should raise a runtime error "naturally" (in my version, it does so "manually": check is built into the copy constructor, as well as every binary operator I implement).
So, is there a way to implement some kind of dynamic typing so that the Mod object's type remembers the modulus? I'd really appreciate any idea how to solve this.
This is a recurring problem for me with various mathematical structures (e.g. storing many permutations on the same set, elements of the same group, etc.)
EDIT: as far as I understand,
templates are types parametrized by a class or literal.
what I want: a type parametrized by a const object (const num in this case, const Group& or const Group *const for groups, etc.).
Is this possible?
It will be difficult to do it in zero storage space if the class needs to know what M should be without any outside help. Likely the best you can do is store a pointer to a shared M, which may be a little better depending on how large num is. But it's not as good as free.
It will be easier to design if M is a passed-in value to all the functions that need it. Then you can do things like make a pool of objects that all share the same M (there are plenty of easy ways to design this; e.g. map<num, vector<num> >) and only store M once for the pool. The caller will need to know which pool the Mod object came from, but that's probably something it knows anyway.
It's hard to answer this question perfectly in isolation... knowing more about the calling code would definitely help you get better answers.
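The pool idea above could look roughly like the following. This is a minimal sketch under my own assumptions, not the asker's design: ModPool and its member names are hypothetical, and only addition is shown. The modulus is stored once per pool, and each element costs a single num.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container: stores the modulus once for all of its values.
template <typename num>
class ModPool
{
public:
    explicit ModPool(num m) : M(m < 0 ? -m : m) {}

    // Reduces v into [0, M) (or keeps it unchanged when M == 0) and stores it.
    void push_back(num v) { values.push_back(normalize(v)); }

    num operator[](std::size_t i) const { return values[i]; }
    std::size_t size() const { return values.size(); }
    num modulus() const { return M; }

    // Returns (values[i] + values[j]) mod M; the modulus comes from the pool,
    // not from the individual elements.
    num add(std::size_t i, std::size_t j) const
    {
        return normalize(values[i] + values[j]);
    }

private:
    num normalize(num v) const
    {
        if (M == 0) return v;
        v %= M;
        return v < 0 ? v + M : v;
    }

    num M;                    // stored once for the whole pool
    std::vector<num> values;  // one num per element
};
```

Since all elements of one pool share M by construction, mixing moduli can only happen across pools, which is where a check would go.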

Lambda closure vs simple argument?

For lambda expressions, I don't quite get the usefulness of closures in C++11.
auto f = [] (int n, int m) { return n + m; };
std::cout << f(2,2);
versus.
int n = 2;
auto f = [n] (int m) { return n + m; };
std::cout << f(2);
This is a very basic and primitive example. I'm guessing that closures play an important part in other kinds of statements, but my C++ book doesn't clarify this (so far).
Why not include the closure as a parameter?
OK, a simple example, remove all the x's from a string
char x = 'x';
std::string s = "Text to remove all 'x's from";
s.erase(std::remove_if(s.begin(), s.end(), [x](char c) {return x == c;}), s.end());
Borrowed and modified from http://en.cppreference.com/w/cpp/algorithm/remove
In this example, remove_if() only takes a single parameter, but I need two values for the comparison.
Closures are not always called immediately. They are objects which can be stored and called later when the data necessary to successfully execute the lambda function may no longer be in scope or easily accessible from the call site.
It's possible to store any necessary data along with the closure, but it's so much simpler for the closure to grab anything it needs when it's created and use it when it's eventually called. It provides a form of encapsulation.
This also decreases code coupling because if you were to store the data along with the code then the caller could only work with the specific objects you decided to store. Since a closure carries its own data along with it, it can work with any data it needs.
Here's a greatly oversimplified real-life example. I built a database server which needed to support fields with multiple values. The problem was that when results were displayed, it was important to highlight which values actually caused a record to match the search criteria. So, the query parser would spit out a predicate in the form of a closure which would indicate whether or not it was a matching value.
It looked something like this:
std::function< bool(int value) > parser::match_int(int search_val) {
return [=](int value) { return value == search_val; };
}
That closure got stored in a collection. When it was time to render the record, I could easily determine which values needed to be highlighted. Keep in mind that the parser and any associated data are now gone:
void render_values(std::function< bool(int value) > pred, std::vector<int> values) {
for (int value : values) {
if (pred(value))
render_highlight(value);
else
render_normal(value);
}
}
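To make the "stored now, called later" point concrete, here is a small self-contained sketch. The names make_match_int and build_predicates are hypothetical stand-ins for the parser described above; the point is only that each closure keeps its own copy of the captured value after the variable it was created from has gone out of scope.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for the parser above: returns a predicate that
// remembers search_val by value.
inline std::function<bool(int)> make_match_int(int search_val)
{
    return [search_val](int value) { return value == search_val; };
}

// Builds predicates inside a nested scope and returns them. The local
// variable they were created from no longer exists when the caller uses them.
inline std::vector<std::function<bool(int)>> build_predicates()
{
    std::vector<std::function<bool(int)>> preds;
    {
        int wanted = 7;  // local to this block
        preds.push_back(make_match_int(wanted));
        wanted = 42;     // the first closure still holds its own copy of 7
        preds.push_back(make_match_int(wanted));
    }
    return preds;
}
```

Passing the search value as an extra parameter instead would force every caller of the predicate to know, and still have access to, the value each predicate was built for.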

Bitmasks: Initialize an int bitmask with a variable length list of ints AND int-ranges

Foreword: I find it annoying that an answer is marked as duplicate without actually checking if it solves one's problem. I've asked this question before, but didn't succeed. In particular, this question is not answered by Implementing Matlab's colon : operator in C++ expression templates class. It is a fixed number of variables vs. a variable number of variables plus a variable combination of ranges of integers and integer values. If you still think it points in the right direction, then please provide a valid solution to my stated problem, or just don't mark it as a duplicate if you can't do so.
=> Please check back if it's answering my question, before you just hit that duplicate button. Thanks.
I'm trying to find the most elegant way to define a bitmask. This bitmask is an integer, and it defines the visibility of objects on a map (32 levels, so bits 0..31 define visibility on each of the 32 levels). The bitmask creation helper should be able to handle a variable length list of integers, as well as integer ranges - AND any combination of these.
E.g. it could then look like:
int visibilityMask = CreateVisibilityMask([1..12], 16, 22);
I guess this one is really tough. But is it impossible?
According to Sarien's answer:
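#include <initializer_list>
#include <vector>

class MaskCreator
{
public:
    // Adds every level in the inclusive range [from, to].
    MaskCreator& AddRange(int from, int to){
        for(int i = from; i <= to; ++i){
            m_list.push_back(i);
        }
        return *this;
    }
    // Adds a single level.
    MaskCreator& Add(int i){
        m_list.push_back(i);
        return *this;
    }
    // Adds several levels at once; std::initializer_list stands in for the varargs.
    MaskCreator& AddMulti(std::initializer_list<int> levels){
        m_list.insert(m_list.end(), levels.begin(), levels.end());
        return *this;
    }
    // One possible implementation: sets one bit per stored level (expected in 0..31).
    unsigned int GetMask() const {
        unsigned int mask = 0;
        for(int level : m_list){
            mask |= 1u << level;
        }
        return mask;
    }
private:
    std::vector<int> m_list;
};
// usage (note the braces around the AddMulti arguments):
unsigned int mask = MaskCreator().Add(3).Add(7).AddRange(16,25).AddMulti({28,30,31}).GetMask();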
obviously the AddMulti could replace the Add;
Make it VisibilityMask(Range(1, 12)).Add(16).Add(22) and it will be easy.
You only need to provide a constructor and an Add function with an overload for a Range object (or two ints) and for one int. If you return the current object you can chain the calls like this.
You could also overload operator<< or operator() if you want prettier syntax. But I don't think it is a good idea. I would only overload those for output (<<) and functors (()) to make your code more readable.
You could probably overload the comma operator or some such shenanigans but I do not recommend it. Stick to idiomatic C++, even if it means that Range(1, 12) now has 4 characters more than [1..12]. You should also not use variadic functions (see here).
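The chained interface suggested above might be sketched like this. It is only an illustration under my own assumptions: the Range struct, the value() accessor and the choice of std::uint32_t are hypothetical, and levels are assumed to lie in 0..31.

```cpp
#include <cstdint>

// Hypothetical helper type: an inclusive range of levels.
struct Range
{
    int from;
    int to;
    Range(int f, int t) : from(f), to(t) {}
};

// Hypothetical sketch of the chained interface described above.
// Levels are assumed to lie in 0..31; no bounds checking is done.
class VisibilityMask
{
public:
    VisibilityMask() = default;
    explicit VisibilityMask(int level) { Add(level); }
    explicit VisibilityMask(Range r) { Add(r); }

    // Sets the bit for a single level and returns *this to allow chaining.
    VisibilityMask& Add(int level)
    {
        mask_ |= std::uint32_t{1} << level;
        return *this;
    }

    // Sets the bits for every level in the inclusive range.
    VisibilityMask& Add(Range r)
    {
        for (int i = r.from; i <= r.to; ++i) Add(i);
        return *this;
    }

    std::uint32_t value() const { return mask_; }

private:
    std::uint32_t mask_ = 0;
};
```

With this, VisibilityMask(Range(1, 12)).Add(16).Add(22).value() yields the integer mask, and the overloads keep ranges and single levels visually distinct at the call site.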
Maybe by using varargs.
I do not use them often; can a vararg list contain vectors?
If yes, you could possibly do something like this:
std::vector<int> SingleIndices( int vararglist)
{
std::vector<int> vec;
/// push all ints
return vec;
}
std::vector<int> Range(int from, int to)
{
std::vector<int> vec;
for(int i= from; i<=to; ++i){
vec.push_back(i);
}
return vec;
}
int CreateVisibilityMask( vectorvarargs );
//usage:
unsigned int mask = CreateVisibilityMask(Range(1,5),SingleIndices(8,9,23), Range(25, 30));
All in all, this seems over-engineered and not very efficient.
Another nice option that came to my mind is to simply use a var-arg int list in the following form:
int mask = getBitMask(1, 4, 6,-10, 15, 20,-24);
Then you process the args, and when you find a negative number, it means that all indices between the last number and the absolute value of the current number form a range.
The only drawback is that there is no semantic compile-time check that prohibits things like
int mask = getBitMask(-2, 4, 6); // although this may be valid and just mean bits 0 to 2
or
int mask = getBitMask(4, -6, -10);
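A minimal sketch of the negative-number convention follows. To stay away from C-style varargs it takes a std::initializer_list, so the call needs braces: getBitMask({1, 4, 6, -10, 15, 20, -24}). Rejecting a leading negative entry or two negatives in a row at run time is my own choice for this sketch, since the compiler cannot check it; levels are assumed to lie in 0..31.

```cpp
#include <cstdint>
#include <initializer_list>
#include <stdexcept>

// Hypothetical implementation of the convention above: a negative entry -k
// closes a range that starts at the previous entry and ends at level k.
inline std::uint32_t getBitMask(std::initializer_list<int> levels)
{
    std::uint32_t mask = 0;
    bool have_start = false;  // true if the previous entry can start a range
    int last = 0;
    for (int v : levels) {
        if (v >= 0) {
            // A plain level: set its bit and remember it as a possible range start.
            mask |= std::uint32_t{1} << v;
            last = v;
            have_start = true;
        } else {
            // A range end: it needs a non-negative entry directly before it.
            if (!have_start) {
                throw std::invalid_argument("range end without a start");
            }
            for (int i = last; i <= -v; ++i) {
                mask |= std::uint32_t{1} << i;
            }
            have_start = false;
        }
    }
    return mask;
}
```

One consequence of the convention is that level 0 can never be the end of a range, because -0 is indistinguishable from 0.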

Understanding std::accumulate

I want to know why std::accumulate's (aka reduce) 3rd parameter is needed. For those who do not know what accumulate is, it's used like so:
vector<int> V{1,2,3};
int sum = accumulate(V.begin(), V.end(), 0);
// sum == 6
Call to accumulate is equivalent to:
sum = 0; // 0 - value of 3rd param
for (auto x : V) sum += x;
There is also an optional 4th parameter, which allows replacing addition with any other operation.
The rationale that I've heard is that if you need, let's say, not to add up but to multiply the elements of a vector, you need a different (non-zero) initial value:
vector<int> V{1,2,3};
int product = accumulate(V.begin(), V.end(), 1, multiplies<int>());
But why not do as Python does: set the initial value from V.begin(), and use the range starting from V.begin()+1. Something like this:
int sum = accumulate(V.begin()+1, V.end(), V.begin());
This will work for any op. Why is 3rd parameter needed at all?
You're making a mistaken assumption: that type T is the same type as the InputIterator's element type.
But std::accumulate is generic, and allows all different kinds of creative accumulations and reductions.
Example #1: Accumulate salary across Employees
Here's a simple example: an Employee class, with many data fields.
class Employee {
/** All kinds of data: name, ID number, phone, email address... */
public:
int monthlyPay() const;
};
You can't meaningfully "accumulate" a set of employees. That makes no sense; it's undefined. But, you can define an accumulation regarding the employees. Let's say we want to sum up all the monthly pay of all employees. std::accumulate can do that:
/** Simple function defining how to add a single Employee's
* monthly pay to our existing tally */
auto accumulate_func = [](int accumulator, const Employee& emp) {
return accumulator + emp.monthlyPay();
};
// And here's how you call the actual calculation:
int TotalMonthlyPayrollCost(const vector<Employee>& V)
{
return std::accumulate(V.begin(), V.end(), 0, accumulate_func);
}
So in this example, we're accumulating an int value over a collection of Employee objects. Here, the accumulation sum isn't the same type of variable that we're actually summing over.
Example #2: Accumulating an average
You can use accumulate for more complex types of accumulations as well - maybe you want to append values to a vector; maybe you have some arcane statistic you're tracking across the input; etc. What you accumulate doesn't have to be just a number; it can be something more complex.
For example, here's a simple example of using accumulate to calculate the average of a vector of ints:
// This time our accumulator isn't an int -- it's a structure that lets us
// accumulate an average.
struct average_accumulate_t
{
int sum;
size_t n;
double GetAverage() const { return ((double)sum)/n; }
};
// Here's HOW we add a value to the average:
auto func_accumulate_average =
[](average_accumulate_t accAverage, int value) {
return average_accumulate_t(
{accAverage.sum+value, // value is added to the total sum
accAverage.n+1}); // increment number of values seen
};
double CalculateAverage(const vector<int>& V)
{
average_accumulate_t res =
std::accumulate(V.begin(), V.end(), average_accumulate_t({0,0}), func_accumulate_average);
return res.GetAverage();
}
Example #3: Accumulate a running average
Another reason you need the initial value is because that value isn't always the default/neutral value for the calculation you're making.
Let's build on the average example we've already seen. But now, we want a class that can hold a running average -- that is, we can keep feeding in new values, and check the average so far, across multiple calls.
class RunningAverage
{
average_accumulate_t _avg;
public:
RunningAverage():_avg({0,0}){} // initialize to empty average
double AverageSoFar() const { return _avg.GetAverage(); }
void AddValues(const vector<int>& v)
{
_avg = std::accumulate(v.begin(), v.end(),
_avg, // NOT the default initial {0,0}!
func_accumulate_average);
}
};
int main()
{
RunningAverage r;
r.AddValues(vector<int>({1,1,1}));
std::cout << "Running Average: " << r.AverageSoFar() << std::endl; // 1.0
r.AddValues(vector<int>({-1,-1,-1}));
std::cout << "Running Average: " << r.AverageSoFar() << std::endl; // 0.0
}
This is a case where we absolutely rely on being able to set that initial value for std::accumulate - we need to be able to initialize the accumulation from different starting points.
In summary, std::accumulate is good for any time you're iterating over an input range, and building up one single result across that range. But the result doesn't need to be the same type as the range, and you can't make any assumptions about what initial value to use -- which is why you must have an initial instance to use as the accumulating result.
The way things are, it is annoying for code that knows for sure a range isn't empty and that wants to start accumulating from the first element of the range on. Depending on the operation that is used to accumulate with, it's not always obvious what the 'zero' value to use is.
If on the other hand you only provide a version that requires non-empty ranges, it's annoying for callers that don't know for sure that their ranges aren't empty. An additional burden is put on them.
One perspective is that the best of both worlds is of course to provide both functionalities. As an example, Haskell provides both foldl1 and foldr1 (which require non-empty lists) alongside foldl and foldr (which mirror std::accumulate).
Another perspective is that since the one can be implemented in terms of the other with a trivial transformation (as you've demonstrated: std::accumulate(std::next(b), e, *b, f) -- std::next is C++11 but the point still stands), it is preferable to make the interface as minimal as it can be with no real loss of expressive power.
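The "one in terms of the other" idea can be sketched as a small helper that seeds std::accumulate with the first element. The name accumulate1 is made up here (borrowed from Haskell's foldl1) and is not part of the standard library; the helper assumes a non-empty range and an accumulator of the same type as the elements.

```cpp
#include <functional>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of the standard library): folds a non-empty
// range using its first element as the initial value.
// Precondition: first != last.
template <typename InputIt, typename BinaryOp>
typename std::iterator_traits<InputIt>::value_type
accumulate1(InputIt first, InputIt last, BinaryOp op)
{
    typename std::iterator_traits<InputIt>::value_type init = *first;
    return std::accumulate(std::next(first), last, init, op);
}

// Usage: the product of a non-empty vector, without naming the neutral element 1.
inline int product_of(const std::vector<int>& v)
{
    return accumulate1(v.begin(), v.end(), std::multiplies<int>());
}
```

The precondition is the price: the caller has to guarantee a non-empty range, which is exactly the burden the three-argument form of std::accumulate avoids.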
Because standard library algorithms are supposed to work for arbitrary ranges of (compatible) iterators. So the first argument to accumulate doesn't have to be begin(), it could be any iterator between begin() and one before end(). It could also be using reverse iterators.
The whole idea is to decouple algorithms from data. Your suggestion, if I understand it correctly, requires a certain structure in the data.
If you wanted accumulate(V.begin()+1, V.end(), V.begin()) you could just write that. But what if you thought v.begin() might be v.end() (i.e. v is empty)? What if v.begin() + 1 is not implemented (because v only implements ++, not generalized addition)? What if the type of the accumulator is not the type of the elements? E.g.
std::accumulate(v.begin(), v.end(), 0, [](long count, char c){
return isalpha(c) ? count + 1 : count;
});
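A complete version of that last case, as a hedged sketch: the function name count_letters is hypothetical. Note that the type of the initial value decides the accumulator type inside std::accumulate, so the sketch passes 0L to actually accumulate in a long, and it casts to unsigned char before calling std::isalpha, which is the usual precaution for plain char.

```cpp
#include <cctype>
#include <numeric>
#include <string>

// Counts alphabetic characters. The accumulator is a long (note the 0L),
// while the elements being visited are chars.
inline long count_letters(const std::string& s)
{
    return std::accumulate(s.begin(), s.end(), 0L, [](long count, char c) {
        return std::isalpha(static_cast<unsigned char>(c)) ? count + 1 : count;
    });
}
```

Here the first element could not serve as the initial value at all, because a char is not a count.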
It's indeed not needed. Our codebase has 2 and 3-argument overloads which use a T{} value.
However, std::accumulate is pretty old; it comes from the original STL. Our codebase has fancy std::enable_if logic to distinguish between "2 iterators and initial value" and "2 iterators and reduction operator". That requires C++11. Our code also uses a trailing return type (auto accumulate(...) -> ...) to calculate the return type, another C++11 feature.