C++ lazy way to find union of std::sets - c++

I want to find the union of two sets, i.e. add them together. I am aware of .insert() and std::set_union(), but, as far as I can tell, these require one to first obtain iterators to the beginning and end of the second set (or, worse, of both sets for set_union()). It would be nice if I could just write something like + and +=; that seems like a fairly obvious feature for a class that implements the mathematical concept of a set. What is the easiest way to do this?

I don't know of any ways to simplify it with existing C++ methods.
One way to simplify container algorithms that operate on the whole container is to wrap them in a template method accepting a container:
template <typename T>
void my_union(const T& cont1, const T& cont2, T& cont3)
{
// Make the union and store the result in cont3
}
If you want to have an operator for this, you can easily define one yourself:
#include <iostream>
#include <set>
using namespace std;
template <typename T>
inline set<T>& operator+=(set<T>& lhs, const set<T>& rhs)
{
lhs.insert(begin(rhs), end(rhs));
return lhs;
}
template <typename T>
inline set<T> operator+(set<T> lhs, const set<T>& rhs)
{
lhs += rhs;
return lhs;
}
int main() {
set<int> a = {1, 2, 3 };
set<int> b = { 2, 3, 4};
a += b;
for (auto i : a)
cout << i << " ";
return 0;
}
The above example will print 1 2 3 4 to the console.
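For a union that leaves both inputs untouched, std::set_union can be wrapped once so that callers never handle iterators themselves. The following is a minimal sketch, assuming C++11; the helper name union_of is hypothetical and not part of the standard library:

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical helper: returns a new set holding every element of a or b.
template <typename T>
std::set<T> union_of(const std::set<T>& a, const std::set<T>& b) {
    std::set<T> result;
    // Both inputs are already sorted, which is what std::set_union requires.
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(result, result.end()));
    return result;
}
```

With a = {1, 2, 3} and b = {2, 3, 4}, union_of(a, b) yields a set containing 1, 2, 3 and 4.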

Related

Hashable type with overwritten operators or external functors

To use a custom type in a std::unordered_set I have two options.
1) Implement the == operator for my type and specialize std::hash
struct MyType {
int x;
bool operator==(const MyType& o) const {
return x == o.x;
}
};
namespace std
{
template<>
struct hash<MyType> {
size_t operator()(const MyType& o) const {
return hash<int>()(o.x);
}
};
}
std::unordered_set<MyType> mySet;
Or 2), provide functor classes:
struct MyTypeHash {
size_t operator()(const MyType& o) const {
return std::hash<int>()(o.x);
}
};
struct MyTypeCompare {
bool operator()(const MyType& o1, const MyType& o2) const {
return o1.x == o2.x;
}
};
std::unordered_set<MyType, MyTypeHash, MyTypeCompare> mySet;
The second approach lets me choose new behaviour for every new instantiation of std::unordered_set, while with the first approach the behaviour, being part of the type itself, will always be the same.
Now, if I know that I only ever want a single behaviour (I'll never define two different comparators for MyType), which approach is to be preferred? What other differences exist between those two?
Attaching the behavior to the type allows for code like
template<template<class> class Set,class T>
auto organizeWithSet(…);
/* elsewhere */ {
organizeWithSet<std::unordered_set,MyType>(…);
organizeWithSet<std::set,MyType>(…);
}
which obviously cannot pass custom function objects.
That said, it is possible to define
template<class T>
using MyUnorderedSet=std::unordered_set<T, MyTypeHash,MyTypeCompare>;
and use that as a template template argument, although that introduces yet another name and might be considered less readable.
Otherwise, you have to consider that your operator== is simultaneously the default for std::unordered_set and std::find, among others; if the equivalence you want for these purposes varies, you probably want named comparators. On the other hand, if one suffices, C++20 might even let you define it merely with =default.
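The alias-template route can be sketched as follows. This is only an illustration, assuming C++11; Point, PointHash, PointEqual, PointSet and count_distinct are hypothetical names standing in for MyType and its functors:

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <vector>

// Hypothetical type and functors, mirroring option 2 from the question.
struct Point { int x; };
struct PointHash {
    std::size_t operator()(const Point& p) const { return std::hash<int>()(p.x); }
};
struct PointEqual {
    bool operator()(const Point& a, const Point& b) const { return a.x == b.x; }
};

// One-parameter alias that bakes the functors in.
template <class T>
using PointSet = std::unordered_set<T, PointHash, PointEqual>;

// Generic code that only knows about a one-parameter set template.
template <template <class> class Set, class T>
std::size_t count_distinct(const std::vector<T>& items) {
    Set<T> seen(items.begin(), items.end());
    return seen.size();
}
```

Calling count_distinct<PointSet, Point>(points) then uses the custom hash and equality without the generic function ever naming them.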

How to promote two template types for arithmetic operations like builtin types do?

If I have a generic struct/class:
template<typename T>
struct Container
{
T value;
Container(const T& value) : value(value) { }
};
And I want to perform an operation on two of them:
template<typename T, typename U>
Container<T> operator+(const Container<T>& lhs, const Container<U>& rhs)
{
return Container<T>(lhs.value + rhs.value);
}
The problem is that if lhs is of type Container<int> and rhs is of type Container<float>, then I'll get a Container<int> back. But if I were to write auto result = 2 + 2.0f, then result would be of type float. So the behavior is inconsistent between builtin types and custom types.
So how would I take the operator+ overload and make it return Container<float>, much like how C++ handles arithmetic promotion with builtin types?
As long as you pick one of the two template types as the return type, you risk forcing a conversion on the result of the sum. For example, if you choose int as the target type, the sum is converted to int even though it was computed as a float.
Starting with C++11, you can rely on the decltype specifier, as in the example below (provided that the + operator is defined for T and U).
I also used auto as the return type of operator+; return type deduction is available starting with C++14 and is really useful here.
Here is the working example:
#include <iostream>
#include <vector>
template<typename T>
struct Container
{
T value;
Container(const T& value) : value(value) { }
};
template<typename T, typename U>
auto operator+(const Container<T>& lhs, const Container<U>& rhs)
{
return Container<decltype(lhs.value+rhs.value)>{lhs.value + rhs.value};
}
int main()
{
Container<int> c1{1};
Container<float> c2{0.5};
std::cout << (c1+c2).value << std::endl;
}
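If C++14 return type deduction is not available, the result type can be spelled out explicitly. The sketch below swaps in std::common_type, which is a different technique from the decltype used above; Box is a hypothetical stand-in for Container. Note one difference: for types subject to integral promotion, such as short, std::common_type<short, short>::type is short, whereas decltype of the sum is int.

```cpp
#include <type_traits>

// Hypothetical stand-in for the Container template from the question.
template <typename T>
struct Box {
    T value;
    explicit Box(const T& v) : value(v) {}
};

// C++11-compatible: the return type is named via std::common_type.
template <typename T, typename U>
Box<typename std::common_type<T, U>::type>
operator+(const Box<T>& lhs, const Box<U>& rhs) {
    typedef typename std::common_type<T, U>::type R;
    // Convert both operands first so the addition happens in the result type.
    return Box<R>(static_cast<R>(lhs.value) + static_cast<R>(rhs.value));
}
```

Adding a Box<int> to a Box<float> then produces a Box<float>, matching what 2 + 2.0f does for builtin types.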

Making return-type dependent on source of invocation?

I wrote a c++-class that represents a mathematical matrix of arbitrary dimension NxM. Furthermore I also wrote a vector-class, deriving from it...
template<size_t N, size_t M>
class matrix{ ... };
template<size_t N>
class vector : public matrix<N,1>{ ... };
...so that an N-vector can be treated as an Nx1-matrix, for example when it comes to multiplying with integral values or addition/subtraction of equally dimensioned matrices (or vectors in this regard).
The idea behind this is to avoid repeating code - which generally is a noble goal, I think. But here is the problem arising from it:
Here is the operator overload for addition, which only exists in the matrix class:
matrix<N,M> operator+(const matrix<N,M>& right){
//calculate some result and use it to construct a new instance
return matrix<N,M>(result);
}
Provided the vector class offers a copy constructor for its matrix representation, it should be possible to write something like this:
vector<3> a(1,2,3);
vector<3> b(3,2,1);
a = a+b;
but you can't say this:
(a+b).some_vector_instance_method();
...because (a+b) isn't a vector.
QUESTION: Is it possible to implement the matrix operator+ so that its return type depends on its source of invocation? So, basically, if you invoke + on a matrix, it should return a matrix; if invoked on a vector, it should return a vector.
Now you can do this:
template<typename D>
D operator+(const D& right){
//calculate result as usual, relying on 'right' to have what it takes
return D(result);
}
... but it is unsafe as hell.
Any ideas?
The simple approach is to implement a member operator+=() for both matrix<M, N> and vector<M>, where the latter simply delegates to the former and the matrix operator performs the actual operation. Using a bit of tagging, operator+() is then implemented as a non-member operator in terms of these operators. Here is a brief sketch:
#include <iostream>
namespace matrix_operators
{
struct tag {};
template <typename T>
T operator+ (T const& lhs, T const& rhs) {
return T(lhs) += rhs;
}
}
template<size_t N, size_t M>
class matrix
: matrix_operators::tag
{
public:
matrix<N, M>& operator+= (matrix<N, M> const&) {
std::cout << "matrix<" << N << ", " << M << ">::operator+=()\n";
return *this;
}
};
template<size_t N>
class vector:
public matrix<N,1>
{
public:
vector<N>& operator+= (vector<N> const& other) {
matrix<N, 1>::operator+= (other);
return *this;
}
void some_other_method() {
std::cout << "vector<" << N << ">::some_other_method()\n";
}
};
int main()
{
vector<3> a, b;
(a + b).some_other_method();
}
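In the sketch above the tag base class only makes the operator findable through argument-dependent lookup; the template itself still accepts any T. One possible refinement, shown here as an assumption rather than as part of the original answer, is to constrain the operator with std::enable_if so that it only applies to types derived from the tag. Matrix, Vec, matrix_ops and first() are hypothetical names, and the element storage is simplified to a public array of double:

```cpp
#include <cstddef>
#include <type_traits>

namespace matrix_ops {
struct tag {};

// Only participates in overload resolution for types derived from tag.
template <typename T,
          typename = typename std::enable_if<std::is_base_of<tag, T>::value>::type>
T operator+(const T& lhs, const T& rhs) {
    T result(lhs);
    result += rhs;  // calls T's own operator+=
    return result;  // the static type of the operands is preserved
}
}  // namespace matrix_ops

template <std::size_t N, std::size_t M>
class Matrix : public matrix_ops::tag {
public:
    Matrix() : data() {}  // zero-initialise all elements
    Matrix& operator+=(const Matrix& other) {
        for (std::size_t i = 0; i < N * M; ++i) data[i] += other.data[i];
        return *this;
    }
    double data[N * M];
};

template <std::size_t N>
class Vec : public Matrix<N, 1> {
public:
    Vec& operator+=(const Vec& other) {
        Matrix<N, 1>::operator+=(other);  // delegate to the matrix implementation
        return *this;
    }
    double first() const { return this->data[0]; }  // vector-only method
};
```

With this in place, (a + b).first() compiles for two Vec<3> objects because the sum is itself a Vec<3>, while adding two Matrix objects still returns a Matrix.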

Fold arbitrary number of pairs of iterators into a new iterator. Metaprogramming for a nice syntax?

I have an algorithm that takes two ranges and returns a range that iterates, computing on the fly, a special subset of elements in the first range based on the contents of the second. The special subset itself can in turn be run through this algorithm on another set. Everything works fine, but I'm banging my head against the wall trying to improve the api with variadic templates. The final clause of the main function below illustrates the goal.
template <class ContainerLeft, class ContainerRight>
class joincalc_const_iter : public std::iterator<std::input_iterator_tag, typename ContainerLeft::value_type> {
public:
joincalc_const_iter& operator++(); /* does complicated stuff to find the next member of a subset in left. */
const typename ContainerLeft::value_type& operator*() const;
const ContainerLeft* left = nullptr;
const ContainerRight* right = nullptr;
...
};
template <class ContainerLeft, class ContainerRight>
class JoinCalc {
public:
typedef joincalc_const_iter<ContainerLeft, ContainerRight> const_iterator;
const_iterator begin() const;
const_iterator end() const;
...
};
template<class L, class R>
JoinCalc<L, R> join(const L& left, const R& right)
{
return JoinCalc<L, R>(left, right);
}
int main()
{
SomeSequence a{...}, b{...};
SomeSequenceDifferentType c{...}, d{...};
/* Works great. */
for (const auto& n : join(a, c))
std::cout << n << "\n";
for (const auto& n : join(a, b))
std::cout << n << "\n";
/* Works, but is a pain to write. I'm trying and failing at using variadic
* templates to automate this. The goal is to write: join(a, b, c, d); */
for (const auto& n : join(join(join(a, b), c), d))
std::cout << n << "\n";
}
I suppose one could resort to macros, but it seems like what I'm shooting for should be possible with variadic templates. I'm just not sharp enough to figure it out and I get confused by the errors. Is there a way to do it with just a template function? Or do you have to build a tuple-like thing with container semantics? If so, how?
In addition to this,
template<class L, class R>
JoinCalc<L, R> join(const L& left, const R& right)
{
return JoinCalc<L, R>(left, right);
}
define this also,
//it is an overload, not specialization
template<class L, class R, class ...Rest>
auto join(const L& left, const R& right, Rest const & ... rest)
-> decltype(join(JoinCalc<L, R>(left, right), rest...))
{
return join(JoinCalc<L, R>(left, right), rest...);
}
Note the trailing-return-type.
How does it work?
If there are more than 2 arguments, the second overload will be invoked, else the first overload will be invoked.
By the way, I would suggest accepting the arguments as universal (forwarding) references and using std::forward to forward them to the constructor and to the other overload.
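The left-to-right folding performed by the two overloads can be seen in isolation with a toy type. In this sketch, Joined is a hypothetical stand-in for JoinCalc that simply stores copies of its two operands; it assumes C++11 and that Joined and join live in the same namespace, so that the recursive call in the trailing return type is found by argument-dependent lookup:

```cpp
// Hypothetical stand-in for JoinCalc: just records what was joined.
template <class L, class R>
struct Joined {
    L left;
    R right;
    Joined(const L& l, const R& r) : left(l), right(r) {}
};

// Base case: exactly two arguments.
template <class L, class R>
Joined<L, R> join(const L& left, const R& right) {
    return Joined<L, R>(left, right);
}

// Recursive case: join the first two, then fold in the rest.
template <class L, class R, class... Rest>
auto join(const L& left, const R& right, const Rest&... rest)
    -> decltype(join(Joined<L, R>(left, right), rest...)) {
    return join(Joined<L, R>(left, right), rest...);
}
```

Here join(1, 2.5, 'c') has type Joined<Joined<int, double>, char>, which is the same nesting as writing join(join(1, 2.5), 'c') by hand.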

Chaining of ordering predicates (e.g. for std::sort)

You can pass a function pointer, function object (or boost lambda) to std::sort to define a strict weak ordering of the elements of the container you want sorted.
However, sometimes (enough that I've hit this several times), you want to be able to chain "primitive" comparisons.
A trivial example would be if you were sorting a collection of objects that represent contact data. Sometimes you will want to sort by last name, first name, area code. Other times first name, last name - yet other times age, first name, area code... etc
Now, you can certainly write an additional function object for each case, but that violates the DRY principle - especially if each comparison is less trivial.
It seems like you should be able to write a hierarchy of comparison functions - the low level ones do the single, primitive, comparisons (e.g. first name < first name), then higher level ones call the lower level ones in succession (probably chaining with && to make use of short circuit evaluation) to generate the composite functions.
The trouble with this approach is that std::sort takes a binary predicate - the predicate can only return a bool. So if you're composing them you can't tell if a "false" indicates equality or greater than. You can make your lower level predicates return an int, with three states - but then you would have to wrap those in higher level predicates before they could be used with std::sort on their own.
In all, these are not insurmountable problems. It just seems harder than it should be - and certainly invites a helper library implementation.
Therefore, does anyone know of any pre-existing library (esp. if it's a std or boost library) that can help here, or have any other thoughts on the matter?
[Update]
As mentioned in some of the comments, I've gone ahead and written my own implementation of a class to manage this. It's fairly minimal and probably has some issues, but for anyone interested, the class is here:
http://pastebin.com/f52a85e4f
And some helper functions (to avoid the need to specify template args) are here:
http://pastebin.com/fa03d66e
You could build a little chaining system like so:
#include <algorithm>
#include <string>
#include <vector>
using namespace std;
struct Type {
string first, last;
int age;
};
struct CmpFirst {
bool operator () (const Type& lhs, const Type& rhs) const { return lhs.first < rhs.first; }
};
struct CmpLast {
bool operator () (const Type& lhs, const Type& rhs) const { return lhs.last < rhs.last; }
};
struct CmpAge {
bool operator () (const Type& lhs, const Type& rhs) const { return lhs.age < rhs.age; }
};
template <typename First, typename Second>
struct Chain {
Chain(const First& f_, const Second& s_): f(f_), s(s_) {}
// Order by f first; fall back to s only when f considers the two elements equivalent.
bool operator () (const Type& lhs, const Type& rhs) const {
if(f(lhs, rhs))
return true;
if(f(rhs, lhs))
return false;
return s(lhs, rhs);
}
template <typename Next>
Chain <Chain, Next> chain(const Next& next) const {
return Chain <Chain, Next> (*this, next);
}
First f;
Second s;
};
// Never orders anything, so an empty chain treats all elements as equivalent.
struct False { bool operator() (const Type&, const Type&) const { return false; } };
template <typename Op>
Chain <False, Op> make_chain(const Op& op) { return Chain <False, Op> (False(), op); }
Then to use it:
vector <Type> v; // fill this baby up
sort(v.begin(), v.end(), make_chain(CmpLast()).chain(CmpFirst()).chain(CmpAge()));
The last line is a little verbose, but I think it's clear what's intended.
One conventional way to handle this is to sort in multiple passes and use a stable sort. Notice that std::sort is generally not stable. However, there’s std::stable_sort.
That said, I would write a wrapper around functors that return a tristate (representing less, equals, greater).
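The multi-pass idea relies on sorting by the least significant key first and then by more significant keys with a stable sort, so that ties keep the order established by the earlier pass. A minimal sketch, with Contact and the function names being hypothetical:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record type for illustration.
struct Contact {
    std::string first, last;
    int age;
};

inline bool first_less(const Contact& a, const Contact& b) { return a.first < b.first; }
inline bool last_less(const Contact& a, const Contact& b) { return a.last < b.last; }

// Sorts by last name, then by first name among equal last names.
inline void sort_by_last_then_first(std::vector<Contact>& v) {
    std::stable_sort(v.begin(), v.end(), first_less);  // secondary key first
    std::stable_sort(v.begin(), v.end(), last_less);   // primary key last; ties keep first-name order
}
```

Each primitive comparison stays a plain bool predicate, at the cost of one full sort per key.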
You can try this:
Usage:
struct Citizen {
std::wstring iFirstName;
std::wstring iLastName;
};
ChainComparer<Citizen> cmp;
cmp.Chain<std::less>( boost::bind( &Citizen::iLastName, _1 ) );
cmp.Chain<std::less>( boost::bind( &Citizen::iFirstName, _1 ) );
std::vector<Citizen> vec;
std::sort( vec.begin(), vec.end(), cmp );
Implementation:
template <typename T>
class ChainComparer {
public:
typedef boost::function<bool(const T&, const T&)> TComparator;
typedef TComparator EqualComparator;
typedef TComparator CustomComparator;
template <template <typename> class TComparer, typename TValueGetter>
void Chain( const TValueGetter& getter ) {
iComparers.push_back( std::make_pair(
boost::bind( getter, _1 ) == boost::bind( getter, _2 ),
boost::bind( TComparer<typename TValueGetter::result_type>(), boost::bind( getter, _1 ), boost::bind( getter, _2 ) )
) );
}
bool operator()( const T& lhs, const T& rhs ) {
BOOST_FOREACH( const auto& comparer, iComparers ) {
if( !comparer.first( lhs, rhs ) ) {
return comparer.second( lhs, rhs );
}
}
return false;
}
private:
std::vector<std::pair<EqualComparator, CustomComparator>> iComparers;
};
std::sort is not guaranteed to be stable because stable sorts are usually slower than non-stable ones, so using a stable sort multiple times looks like a recipe for performance trouble.
And yes, it's really a shame that sort asks for a predicate.
I see no other way than to create a functor accepting a vector of tristate functions.
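Such a functor might look like the sketch below, assuming C++11. Person, ThreeWay, ChainedLess and the by_* functions are hypothetical names; each tristate function returns a negative value, zero, or a positive value:

```cpp
#include <algorithm>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical record type for illustration.
struct Person {
    std::string first, last;
    int age;
};

// Tristate comparators: negative, zero or positive, like strcmp.
inline int by_first(const Person& a, const Person& b) { return a.first.compare(b.first); }
inline int by_last(const Person& a, const Person& b) { return a.last.compare(b.last); }
inline int by_age(const Person& a, const Person& b) { return (a.age > b.age) - (a.age < b.age); }

typedef int (*ThreeWay)(const Person&, const Person&);

// Adapts a list of tristate comparators to the bool predicate std::sort expects.
class ChainedLess {
public:
    ChainedLess(std::initializer_list<ThreeWay> cmps) : cmps_(cmps) {}
    bool operator()(const Person& a, const Person& b) const {
        for (ThreeWay cmp : cmps_) {
            int r = cmp(a, b);
            if (r != 0) return r < 0;  // first key that differs decides
        }
        return false;  // equivalent under every key
    }
private:
    std::vector<ThreeWay> cmps_;
};
```

It can be passed directly to a single sort call, for example std::sort(v.begin(), v.end(), ChainedLess{by_last, by_first}).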
The chaining solution is verbose. You could also use boost::bind in conjunction with std::logical_and to build your sorting predicate. See the linked article for more information: How the boost bind library can improve your C++ programs
Variadic templates in C++11 give a shorter option:
#include <iostream>
using namespace std;
struct vec { int x,y,z; };
struct CmpX {
bool operator() (const vec& lhs, const vec& rhs) const
{ return lhs.x < rhs.x; }
};
struct CmpY {
bool operator() (const vec& lhs, const vec& rhs) const
{ return lhs.y < rhs.y; }
};
struct CmpZ {
bool operator() (const vec& lhs, const vec& rhs) const
{ return lhs.z < rhs.z; }
};
template <typename T>
bool chained(const T &, const T &) {
return false;
}
template <typename CMP, typename T, typename ...P>
bool chained(const T &t1, const T &t2, const CMP &c, P...p) {
if (c(t1,t2)) { return true; }
if (c(t2,t1)) { return false; }
else { return chained(t1, t2, p...); }
}
int main(int argc, char **argv) {
vec x = { 1,2,3 }, y = { 2,2,3 }, z = { 1,3,3 };
cout << chained(x,x,CmpX(),CmpY(),CmpZ()) << endl;
return 0;
}
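When every key is a plain member compared with <, a different technique from the chaining above is to compare tuples of references built with std::tie, which orders lexicographically. A small sketch, assuming C++11; Vec3 and the function names are hypothetical:

```cpp
#include <tuple>

// Hypothetical type for illustration.
struct Vec3 {
    int x, y, z;
};

// Orders by x, then y, then z.
inline bool less_xyz(const Vec3& a, const Vec3& b) {
    return std::tie(a.x, a.y, a.z) < std::tie(b.x, b.y, b.z);
}

// Orders by z, then x; a different key order is just a different tie.
inline bool less_zx(const Vec3& a, const Vec3& b) {
    return std::tie(a.z, a.x) < std::tie(b.z, b.x);
}
```

This does not reuse existing comparator objects the way chained does, so it fits best when the keys are known at the point where the comparison is written.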