Set custom comparison type - C++

I am using multisets (sets would be the same), and have them as arguments to a bunch of functions. My functions looked like this:
void insert(const int val, multiset<int>& low, multiset<int>& high)
Then I found out I needed a custom comparison function for one of the multisets. I did it by declaring a struct and overloading the () operator.
My multiset definition that once was multiset<int> low is now multiset<int, order> low.
The problem with this is that I'm actually changing the type of low, and thus I need to change it in every single parameter, which greatly reduces the generality of my functions (the functions do not need to know the comparison method of the multiset).
Moreover, order is one comparison function, which is different from any other comparison function I might ever declare (even though the types it compares are exactly the same).
What I mean is that multiset<int, order1> != multiset<int, order2>, which is very bad.
So, my question is, how can I not have this problem? How can I declare functions that accept multisets (or sets) regardless of their comparison function?

You could use function templates:
template <typename M1, typename M2>
void insert(const int val, M1& low, M2& high);
Another option, if you want to restrict yourself to std::multiset<int, X>, is to use template template parameters.
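For instance, here is a rough sketch of that approach (illustrative only; the function body is a placeholder, not the asker's logic):
#include <set>

// Sketch only: the container is a template template parameter with the same
// arity as std::multiset (key, comparator, allocator), so any
// std::multiset<int, Cmp> is accepted regardless of its comparator.
template <template <class, class, class> class Set,
          class Cmp1, class Alloc1, class Cmp2, class Alloc2>
void insert(const int val,
            Set<int, Cmp1, Alloc1>& low,
            Set<int, Cmp2, Alloc2>& high)
{
    low.insert(val);   // placeholder body for illustration
    high.insert(val);
}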

If possible, I would use templates to take an arbitrary container or iterators.
If you really need it to not be a template and be able to deal with different types of multiset, boost::any_range provides a type-erased container abstraction that might be useful.
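As a rough, untested sketch of the boost::any_range idea (the exact template parameters should be checked against the Boost.Range documentation), a read-only function could erase the multiset's comparator like this:
#include <boost/range/any_range.hpp>
#include <boost/iterator/iterator_categories.hpp>
#include <cstddef>
#include <set>

// Type-erased view over any forward range of int; the underlying container
// and its comparator no longer appear in the type. Note this only allows
// traversal, not insertion.
using IntRange = boost::any_range<int, boost::forward_traversal_tag,
                                  const int&, std::ptrdiff_t>;

double sum(IntRange r) {
    double total = 0;
    for (int v : r) total += v;
    return total;
}

// Usage (sketch): std::multiset<int, order> low = ...; double s = sum(low);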


Different types for `std::sort` comparator in C++

When we provide a comparator function for std::sort, we use the following overload:
template< class RandomIt, class Compare >
void sort( RandomIt first, RandomIt last, Compare comp );
in which the comparator function for std::sort should have the following syntax:
bool cmp(const Type1 &a, const Type2 &b);
But as you can see, a and b may have different types. cppreference says:
The types Type1 and Type2 must be such that an object of type RandomIt
can be dereferenced and then implicitly converted to both of them.
But I still cannot understand exactly how we can have 2 different types in a single array when we try to sort it.
Is it possible for someone to provide a small example with different types for std::sort's comparator function?
It's not about what is stored in the array; only one type can ever be stored. It is about what types the comparator function accepts. Take, for example, this:
struct Animal {};
struct Cat : Animal {};
struct Dog : Animal {};
struct Hound : Dog {};
bool cmp(const Animal &a, const Animal &b);
Even if you have a list of Dogs, Cats, or Hounds, you can still sort them with the function cmp, because they are all implicitly convertible to Animal. I.e.:
std::vector<Hound> hounds;
... // fill hounds
std::sort(hounds.begin(), hounds.end(), cmp);
And you can even imagine cases where Type1 and Type2 are not the same, e.g.:
bool cmp(const Animal &a, const Dog &b);
etc ...
Although this would be exceedingly rare.
The types Type1 (Animal) and Type2 (Dog) must be such that an object of type RandomIt (Hound) can be dereferenced and then implicitly converted to both of them. Which is true.
The point is that restricting the types a cmp function can accept to be the same would preclude generality. In some cases that is a good idea, but in this case it would be unreasonably strict and could cause problems for edge cases. Furthermore, the cmp function used in std::sort is bound by the requirements set out for Compare (probably for simplicity), and the Compare requirements are used for all sorts of other things, like std::max.
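For instance (a minimal illustration, not from the original answer), the comparator overload of std::max accepts the same kind of callable as std::sort:
#include <algorithm>

// The same Compare requirements serve algorithms other than sorting.
int larger_of_two() {
    auto cmp = [](int a, int b) { return a < b; };
    return std::max(1, 2, cmp);   // returns 2
}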
But I still cannot understand exactly how we can have 2 different types in a single array when we try to sort it.
You can't have two different types in an array. The comparator doesn't suggest it's possible. It's specified like that simply because:
The code can be well formed when the types are not the same.
Demanding the same type is a restriction that serves little to no purpose.
So the specification offers a looser contract than the "obvious" one, in order to help our code be more flexible if needed. As a toy example, say we have this comparator lying around:
auto cmp(int a, long b) -> bool { return a < b; }
Why prevent us from using this perfectly legal (albeit silly) function to sort an array of integers?
But I still cannot understand exactly how we can have 2 different types in a single array when we try to sort it.
You can't.
But the requirements of Compare are not just for sorting arrays, or just for sorting at all!
They're for any time you want to compare one thing to another thing.
Is minutes(42) less than hours(1)? Yes! You may find a comparator useful for such occasions.
Compare is a more general concept that finds uses throughout the language.
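A small sketch of that idea (my own illustration), using std::chrono durations of different types:
#include <chrono>

// A comparator whose two parameters have different types; std::chrono
// durations with different periods compare through their common type.
bool cmp(std::chrono::minutes m, std::chrono::hours h) {
    return m < h;
}

// cmp(std::chrono::minutes(42), std::chrono::hours(1)) is true:
// 42 minutes is less than 60 minutes.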
Is it possible for someone to provide a small example with different types for std::sort's comparator function?
Others have shown examples that indicate how silly you have to get to find a "useful" example to use against std::sort specifically.
But it's not "std::sort's comparator function". It's a comparator function, which you just so happen to be using with std::sort.
It's true that, when doing so, you probably want the particular comparator that you pick to accept operands of the same type.
But I still cannot understand exactly how we can have 2 different types in a single array
You cannot have two different types in a single array.
An array can have objects of only single type. But that single type must be implicitly convertible to both argument types of cmp.
Is it possible for someone to provide a small example with different types for std::sort's comparator function?
Here you go:
int arr[] = {1, 2, 3, 0};
auto cmp = [](const int &a, const long &b) {
    return a < b;
};
std::sort(std::begin(arr), std::end(arr), cmp);
Note the two different parameter types of cmp. This is just a minimal example, which is technically correct but admittedly nonsensical. Frankly, I've never encountered a case where it would be useful to have different types for the arguments of a comparison function.
The requirements for a comparator are far looser than you think:
It must accept two dereferenced iterators into the sequence as arguments.
Using an implicit conversion-sequence is fine.
The return-value must be contextually-convertible to bool.
An explicit conversion-operator works just fine.
It must be a copyable and nothrow-destructible complete type.
It must not modify the arguments, so it doesn't interfere with the calling algorithm.
That does not in any way imply the use of constant references if references are used at all.
It must induce a strict weak ordering (cmp(a, b) implies !cmp(b, a), and cmp(a, b) && cmp(b, c) implies cmp(a, c)).
So, a valid but fairly useless comparator would be:
template <class... X>
auto useless(X&&...) { return nullptr; }
The type requirements on Compare don't say much about the elements of the sequence you are sorting; instead, they allow every comp for which
if (comp(*first, *other))
is valid.
Most of the time, Type1 will be equal to Type2, but they are not required to be equal.

Why can we not access elements of a tuple by index?

tuple<int, string, int> x = make_tuple(1, "anukul", 100);
cout << x[0]; //1
cout << get<0>(x); //2
2 works. 1 does not.
Why is it so?
From Lounge C++ I learnt that it is probably because the compiler does not know what data type is stored at that index.
But it did not make much sense to me as the compiler could just look up the declaration of that tuple and determine the data type or do whatever else is done while accessing other data structures' elements by index.
Because [] is an operator (named operator[]), thus a member function, and is called at run-time.
Getting a tuple element, on the other hand, is a template mechanism, so it must be resolved at compile time, which means it can only be done with the <> template syntax.
To understand why, note that a tuple may store different types. A template function may return a different type depending on the index passed, since that is resolved at compile time.
operator[], however, must return a single type, whatever the value of the passed parameter is. Thus the tuple functionality is not achievable with it.
get<0>(x) and get<1>(x) are two different functions generated at compile time, and they return different types. In fact, the compiler generates two functions whose names will be mangled to something like
int get_tuple_int_string_int_0(x)
and
string get_tuple_int_string_int_1(x)
The other answers here address the issue of why this isn't possible to implement, but it's also worth asking the question of whether it should be possible. (The answer is no.)
The subscript operator [] is semantically supposed to indicate dynamically-resolved access to an element of a collection, such as an array or a list (of any implementation). The access pattern generally implies certain things: the number of elements probably isn't known to the surrounding code, which element is being accessed will probably vary at runtime, and the elements are all of the same observable type (thus, to the calling code, interchangeable).
Thing is, a tuple isn't (that kind of) a collection. It's actually an anonymous struct, and its elements aren't interchangeable slots at all - semantically, they are regular fields. What's probably throwing you off is that they happen to be labelled with numbers, but that's really just an anonymous naming pattern - analogous to accessing the elements as x._0, x._1, etc. (The fact you can compute the field names at compile-time is a coincidental bonus enabled by C++'s type system, and is not fundamentally related to what a tuple is; tuples, and this answer, are not really specific to C++.)
So it doesn't support operator[] for the same reason that plain old structs don't support operator[]: there's no semantically-valid use for it in this context. Structures have a fixed set of fields that aren't interchangeable or dynamically computable, and since the tuple is a structure, not a collection, it follows the same rule. Its field names just look different.
It can be supported; it just needs to take a compile-time index. Since parameters of a function cannot be made constexpr, we need to wrap the index within a type and pass that instead (e.g., std::integral_constant<std::size_t, N>).
The following is an extension of std::tuple that supports operator[].
#include <cstddef>
#include <tuple>

template <typename... Ts>
class tuple : public std::tuple<Ts...> {
public:
    using std::tuple<Ts...>::tuple;

    template <std::size_t N>
    decltype(auto) operator[](std::integral_constant<std::size_t, N>) {
        return std::get<N>(*this);
    }
};
It would be used like so:
tuple<int, std::string> x(42, "hello");
std::cout << x[std::integral_constant<std::size_t, 0>{}] << std::endl;
// prints: 42
To mitigate the std::integral_constant verbosity, we can use a variable template:
template <std::size_t N>
std::integral_constant<std::size_t, N> ic;
With this, we can say:
std::cout << x[ic<1>] << std::endl; // prints: hello
So it could be done. One guess as to why this is currently not available is that features such as std::integral_constant and variable templates may not have existed when std::tuple was introduced. As to why it still doesn't exist now that those features do, I would guess it's because no one has proposed it yet.
Supporting operator[] isn't very clean given that you can't vary the static return type to match the accessed element. If the Standard Library had incorporated something like boost::any or boost::variant, it would make more sense.
Put another way, if you write something like:
int n = atoi(argv[1]);
int x = t[n]; // t is some hypothetical tuple indexed at run time
Then what should it do if n doesn't address an int member of the tuple? To even support checking, you'd need to store some manner of run-time type information for tuples, which is extra overhead in the executable/memory.
Containers that support the subscript operator (i.e., operator[]), like std::vector or std::array, are collections of homogeneous values. Whatever the index provided to the subscript operator is, the value to return is always of the same type. Therefore, those containers can define a member function with the following declaration:
T& operator[](int);
Where T is the type of every element in the collection.
On the other hand, a std::tuple is a collection of heterogeneous values. The return type of a hypothetical subscript operator for std::tuple would need to vary with the index. Therefore, its return type depends on the index.
In the declaration of the operator[] given above, the index is provided as a function argument and therefore may be determined at run time. However, the return type of the function is something that needs to be determined at compile time, not at run time.
Since the return type of such a function depends on the index but must be determined at compile-time, the solution is to define instead a function template that accepts the index as a (non-type) template parameter. This way, the index is provided as a compile-time constant and the return type is able to change with the index:
template<std::size_t I, class... Types>
typename std::tuple_element<I, tuple<Types...>>::type& get(tuple<Types...>&) noexcept;
As you can see, std::get's return type depends on the index, I:
std::tuple_element<I, tuple<Types...>>::type&
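To make that concrete, here is a small illustration of my own (using C++17 traits): each index instantiates a different std::get, with a different return type, for the tuple from the question.
#include <string>
#include <tuple>
#include <type_traits>

std::tuple<int, std::string, int> x{1, "anukul", 100};

// A different function, with a different return type, for each index.
static_assert(std::is_same_v<decltype(std::get<0>(x)), int&>);
static_assert(std::is_same_v<decltype(std::get<1>(x)), std::string&>);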
Because tuple has no operator "bracket".
Why is it so? You cannot resolve templates based only on the return value. You cannot write
template<typename T>
T tuple::operator[](size_t i) const;
which would be necessary to allow statements like x[0].

C++11esque signature of a function computing a scalar of a collection of values

Up to now, I would have guessed that
double mean(ConstIterator startIt, ConstIterator endIt);
is a decent signature for a function computing the mean of a collection of values stored in an std collection.
But with C++11, we have both lambdas and range-based for loops (for (auto val : col)).
What is a best practice signature for such a function?
Until we get ranges, nothing much will change in terms of functions taking collections of values.
However, unless the function is specific to certain types, this sort of thing is usually implemented generically:
template<typename Iterator, typename Sentinel>
auto mean(Iterator begin, Sentinel end) { // C++14 deduced return type
// ...
}
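A minimal sketch of how the body might be filled in (my own illustration, assuming the values are convertible to double):
#include <cstddef>

template <typename Iterator, typename Sentinel>
auto mean(Iterator begin, Sentinel end) {
    double sum = 0;
    std::size_t count = 0;
    for (auto it = begin; it != end; ++it) {
        sum += *it;   // assumes the value type converts to double
        ++count;
    }
    return count ? sum / count : 0.0;
}

// Usage:
// std::vector<int> v{1, 2, 3, 4};
// double m = mean(v.begin(), v.end());   // 2.5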

Generic container & variable element access

In my variable data, when running the "add a variable" script code, how do I define a generic container for all types? And what is the generic formula to access them? It's annoying because I have to define a vector template for each type (int, float, double, etc.). My variable should contain only a single generic vector object, whether it holds int, float, double, etc. Is that possible? Any ideas?
That's the whole point of the Standard Template Library.
std::vector<int>
std::vector<float>
Two vectors - the same class template, instantiated with different types (and therefore two distinct types).
If you want a container of differing type you might want to look at std::tuple
If you want a single vector that contains objects of many different types, then you might want to use boost::any, or possibly boost::variant.
Edit:
Oh, you want a container that can hold any type? Yes, you can have one, but within certain restrictions.
See boost::any and boost::variant. boost::variant lets you store data of several types, enumerated when declaring the boost::variant. boost::any doesn't make you enumerate all the types you want to support, but you need to cast the stored value back yourself.
In short, you must either store the type information somewhere else (when using boost::any), or support just a few types (say, a heterogeneous vector that supports int and double using boost::variant).
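For example, a sketch of the boost::variant option (names are mine): a single vector holding both int and double values.
#include <boost/variant.hpp>
#include <iostream>
#include <vector>

int main() {
    // One vector whose elements may be either int or double.
    std::vector<boost::variant<int, double>> values;
    values.push_back(42);     // stores an int
    values.push_back(3.14);   // stores a double

    // boost::get retrieves the value, given the type you expect it to hold.
    std::cout << boost::get<int>(values[0]) << ' '
              << boost::get<double>(values[1]) << '\n';
}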
Templates in C++ exist exactly to eliminate the need to write the same class for every type.
For example:
// function template
template <class T>
T GetMax(T a, T b) {
    T result;
    result = (a > b) ? a : b;
    return result;
}
This GetMax should work for any type that has a > operator. So that is exactly what templates are for in C++.
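For example, the same template instantiates for each type you call it with:
// Uses the GetMax template defined above.
int    i = GetMax(3, 7);      // instantiates GetMax<int>;    i == 7
double d = GetMax(2.5, 1.5);  // instantiates GetMax<double>; d == 2.5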
If you need more help on implementing a vector in C++ (which, by the way, is not so simple when it comes to custom types that have their own constructors and destructors; you may need an allocator to get uninitialized space), read this: Implementation of Vector in C++.

binary_search, find_if and <functional>

std::find_if takes a predicate in one of its overloads. Binders make it possible to write EqualityComparators for user-defined types and use them either for dynamic or static comparison.
In contrast, the binary search functions of the standard library take a comparator and a const T& to the value that should be used for comparison. This feels inconsistent to me and could possibly be less efficient, as the comparator has to be called with both arguments every time instead of having the constant argument bound to it. While it would be possible to implement std::binary_search in a way that uses std::bind, this would require all comparators to inherit from std::binary_function. Most code I've seen doesn't do that.
Is there a possible benefit from letting comparators inherit from std::binary_function when using it with algorithms that take a const T& as a value instead of letting me use the binders? Is there a reason for not providing predicate overloads in those functions?
A single-argument predicate version of std::binary_search wouldn't be able to complete in O(log n) time.
Consider the old game "guess the letter I'm thinking of". You could ask: "Is it A?" "Is it B?".. and so on until you reached the letter. That's a linear, or O(n), algorithm. But smarter would be to ask "Is it before M?" "Is it before G?" "Is it before I?" and so on until you get to the letter in question. That's a logarithmic, or O(log n), algorithm.
This is what std::binary_search does, and to do this it needs to be able to distinguish three conditions:
Candidate C is the searched-for item X
Candidate C is greater than X
Candidate C is less than X
A one-argument predicate P(x) says only "x has property P" or "x doesn't have property P". You can't get three results from this boolean function.
A comparator (say, <) lets you get three results by calculating C < X and also X < C. Then you have three possibilities:
!(C < X) && !(X < C) : C is equal to X
(C < X) && !(X < C) : C is less than X
!(C < X) && (X < C) : C is greater than X
Note that both X and C get bound to both parameters of < at different times, which is why you can't just bind X to one argument of < and use that.
Edit: thanks to jpalecek for reminding me binary_search uses <, not <=.
Edit edit: thanks to Rob Kennedy for clarification.
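To make the three-way distinction above concrete, here is a small sketch of my own showing how "equivalence" falls out of two calls to the comparator:
// Neither element is ordered before the other, so they are equivalent
// under the strict weak ordering induced by comp.
template <class T, class Compare>
bool equivalent(const T& c, const T& x, Compare comp) {
    return !comp(c, x) && !comp(x, c);
}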
They are completely different algorithms: find_if looks linearly for the first item for which the predicate is true, while binary_search takes advantage of the fact that the range is sorted to test in logarithmic time whether a given value is in it.
The comparator for binary_search specifies the ordering according to which the range is sorted (you'd most likely want to use the same comparator you used for sorting it).
You can't take advantage of the sortedness to search for a value satisfying some completely unrelated predicate (you'd have to use find_if anyway). Note, however, that with a sorted range you can do more than just test for existence, using lower_bound, upper_bound and equal_range.
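A brief illustration of my own, reusing the sorting comparator with the sorted-range algorithms mentioned above:
#include <algorithm>
#include <vector>

int main() {
    auto cmp = [](int a, int b) { return a < b; };

    std::vector<int> v{5, 1, 4, 1, 3};
    std::sort(v.begin(), v.end(), cmp);                           // 1 1 3 4 5

    bool found = std::binary_search(v.begin(), v.end(), 3, cmp);  // true
    auto ones  = std::equal_range(v.begin(), v.end(), 1, cmp);    // the two 1s
    (void)found; (void)ones;
}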
The question of what the purpose of std::binary_function is, is an interesting one.
All it does is provide typedefs for result_type, first_argument_type and second_argument_type. These allow users, given a functor as a template argument, to find out and use those types, e.g.:
template <class T, class BinaryFunction>
void foo(const T& a, const T& b, BinaryFunction f)
{
    // declare a variable to store the result of the function call
    typename BinaryFunction::result_type result = f(a, b);
    // ...
}
However, I think the only place where they are used in the standard library is in creating other functor wrappers like bind1st, bind2nd, not1, and not2. (If they were required anywhere else, people would yell at you any time you used a plain function as a functor, since plain functions don't provide those typedefs.)
For example, binary_negate might be implemented as (GCC):
template<typename _Predicate>
class binary_negate
    : public binary_function<typename _Predicate::first_argument_type,
                             typename _Predicate::second_argument_type, bool>
{
protected:
    _Predicate _M_pred;

public:
    explicit
    binary_negate(const _Predicate& __x) : _M_pred(__x) { }

    bool
    operator()(const typename _Predicate::first_argument_type& __x,
               const typename _Predicate::second_argument_type& __y) const
    { return !_M_pred(__x, __y); }
};
Of course, operator() could perhaps just be a template, in which case those typedefs would be unnecessary (any downsides?). There are probably also metaprogramming techniques to find out what the argument types are without requiring the user to typedef them explicitly. I suppose it would somewhat get in the way of the power that C++0x gives - e.g. when I'd like to implement a negator for a function of any arity with variadic templates...
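As a sketch of that last idea (my own, C++11-style), here is a negator for a callable of any arity that needs none of binary_function's typedefs:
#include <utility>

template <class F>
struct negator {
    F f;

    // Forwards any number of arguments to f and negates the result.
    template <class... Args>
    bool operator()(Args&&... args) const {
        return !f(std::forward<Args>(args)...);
    }
};

// Helper so the functor type can be deduced (pre-C++17 style).
template <class F>
negator<F> negate_fn(F f) { return negator<F>{std::move(f)}; }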
(IMO the C++98 functors are a bit too inflexible and primitive compared for example to std::tr1::bind and std::tr1::mem_fn, but probably at the time compiler support for metaprogramming techniques required to make those work was not that good, and perhaps the techniques were still being discovered.)
This is a misunderstanding of the Functor concept in C++.
It has nothing to do with inheritance. The property that makes an object a functor (eligible for passing to any of the algorithms) is the validity of the expression object(x) or object(x, y), respectively, regardless of whether it is a function pointer or an object with an overloaded function call operator. It is definitely not inheritance from anything. The same applies to std::bind.
The use of binary functors as comparators comes from the fact that comparators (e.g. std::less) are binary functors, and it's good to be able to use them directly.
IMHO there would be no gain in providing or using the predicate version you propose (after all, it just means passing one extra reference). There would be no (performance) gain in using binders either, because a binder does the same thing as the algorithm (bind would pass the extra argument in lieu of the algorithm).