C++ STL sort() function, binary predicate


I have a piece of code that confuses me:
sort(data, data+count, greater<int>() );
It is a sort function from the C++ standard library. I am having trouble figuring out the meaning of the third argument. I have read that it is called a binary predicate. What does that mean, and how can I make my own such predicate?

The third argument is called a predicate. You can think of a predicate as a function that takes a number of arguments and returns true or false.
So for example, here is a predicate that tells you whether an integer is odd:
bool isOdd(int n) {
return n & 1;
}
The function above takes one argument, so you could call it a unary predicate. If it took two arguments instead, you would call it a binary predicate. Here is a binary predicate that tells you if its first argument is greater than the second:
bool isFirstGreater(int x, int y) {
return x > y;
}
Predicates are commonly used by very general-use functions to allow the caller of the function to specify how the function should behave by writing their own code (when used in this manner, a predicate is a specialized form of callback). For example, consider the sort function when it has to sort a list of integers. What if we wanted it to sort all odd numbers before all even ones? We don't want to be forced to write a new sort function each time that we want to change the sort order, because the mechanics (the algorithm) of the sort is clearly not related to the specifics (in what order we want it to sort).
So let's give sort a predicate of our own to make it sort in reverse:
// As per the documentation of sort, this needs to return true
// if x "goes before" y. So it ends up sorting in reverse.
bool isLarger(int x, int y) {
return x > y;
}
Now this would sort in reverse order:
sort(data, data+count, isLarger);
The way this works is that sort internally compares pairs of integers to decide which one should go before the other. For such a pair x and y, it does this by calling isLarger(x, y).
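The "odd numbers before even ones" ordering mentioned above could be written as a binary predicate along these lines. This is only a sketch: the names isOddValue, oddGoesFirst and sortedOddsFirst are made up for illustration, and the order of values with the same parity is left unspecified.

```cpp
#include <algorithm>
#include <vector>

// Unary helper in the spirit of isOdd above (illustrative name).
bool isOddValue(int n) {
    return n % 2 != 0;
}

// Binary predicate: true only when x must go before y,
// that is, when x is odd and y is even. Two values with the
// same parity are treated as equivalent by this ordering.
bool oddGoesFirst(int x, int y) {
    return isOddValue(x) && !isOddValue(y);
}

// Hypothetical wrapper: returns a copy with all odd values first.
std::vector<int> sortedOddsFirst(std::vector<int> values) {
    std::sort(values.begin(), values.end(), oddGoesFirst);
    return values;
}
```

Note that the predicate returns false when both arguments have the same parity, in either order. std::sort relies on this: a comparator must never claim that x goes before y and also that y goes before x.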
So at this point you know what a predicate is, where you might use it, and how to create your own. But what does greater<int> mean?
greater<T> is a binary predicate that tells if its first argument is greater than the second. It is also a templated struct, which means it has many different forms based on the type of its arguments. This type needs to be specified, so greater<int> is the template specialization for type int (read more on C++ templates if you feel the need).
So if greater<T> is a struct, how can it also be a predicate? Didn't we say that predicates are functions?
Well, greater<T> is a function in the sense that it is callable: it defines the operator bool operator()(const T& x, const T& y) const;, which makes writing this legal:
std::greater<int> predicate;
bool isGreater = predicate(1, 2); // isGreater == false
Objects of class type (or structs, which is pretty much the same in C++) which are callable are called function objects or functors.
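To tie this back to the line in the question, here is a small sketch that passes a temporary std::greater<int> to std::sort. The wrapper name sortedDescending is an assumption made for this example; it uses a std::vector instead of the data/count pair from the question.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Returns a copy of the input sorted from largest to smallest.
std::vector<int> sortedDescending(std::vector<int> values) {
    // std::greater<int>() creates a temporary function object;
    // std::sort calls it as comp(a, b) whenever it compares two elements.
    std::sort(values.begin(), values.end(), std::greater<int>());
    return values;
}
```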

There is a class template called greater which needs a type argument, so you provide int as one. That gives greater<int>; you create an instance of this class and pass it to the function as the third argument.
Such an object is called a function object, or simply a functor, in C++, because the class overloads the () operator. It's a callable entity. It's like this:
template<typename T>
struct greater
{
bool operator()(const T &a, const T &b) const
{
//compare a and b and return either true or false.
return a > b;
}
};
If you create an instance of greater<int> and, say, the object is g, then you can write g(100,200), which evaluates to a boolean value: the expression g(100,200) calls operator(), passing 100 as the first argument and 200 as the second, and operator() compares them and returns either true or false.
std::cout << g(100,200) << std::endl;
std::cout << g(200,100) << std::endl;
Output:
0
1
Online demo : http://ideone.com/1HKfC

A binary predicate is any function/object that receives two objects (hence binary) and returns a bool (hence predicate); the idea is that it evaluates if the two objects satisfy some particular condition - in the example, if one is greater than the other.
You can create a predicate either by just defining a function with the correct signature
bool IsIntGreater(int First, int Second)
{
return First>Second;
}
and passing the name of the function as the argument (this will result in passing a function pointer), or creating a function object (a functor), i.e. an object which overloads the function call operator and thus can be used as a function; the std::greater<T> type is a template functor, and in your snippet a temporary object of type std::greater<int> is created and passed to the std::sort algorithm.
Functors have several advantages over functions, especially when they have to be passed as arguments.
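One advantage that is easy to show is that a function object can carry state, which a plain function can only do through globals. The sketch below assumes a made-up functor CloserTo that orders integers by their distance from a target value chosen at run time; sortedByDistance is likewise just an illustrative wrapper.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Hypothetical functor: orders integers by their distance from `target`.
struct CloserTo {
    int target;  // state stored inside the predicate object

    explicit CloserTo(int t) : target(t) {}

    bool operator()(int a, int b) const {
        return std::abs(a - target) < std::abs(b - target);
    }
};

// Returns a copy sorted so that values nearest to `target` come first.
std::vector<int> sortedByDistance(std::vector<int> values, int target) {
    std::sort(values.begin(), values.end(), CloserTo(target));
    return values;
}
```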

See comp in http://www.cplusplus.com/reference/algorithm/sort/
It is the function that does the comparison.

Related

What is this "operator" block of code in c++ class

I'm using someone's class for bitmaps, which are ways of storing chess positions in 64-bit bitsets. I was wondering what the part with auto() operator does. Is "auto" used because it returns one bit, which is why a return-type isn't specified for the function? I get that it checks that x and y are in the bounds of the chess board, and asserts an error if they aren't. I also understand that it returns a bit that corresponds to the x,y value pair for the bitset. I also don't get why the function is defined like it is, with an extra pair of parentheses. Any help is appreciated!
class BitBoard {
private:
std::bitset<64> board;
public:
auto operator()(int x, int y) {
assert(0<=x && x<=7);
assert(0<=y && y<=7);
return board[8*y+x];
}
};

The "extra" pair of parentheses are because you're defining operator(), which lets instances of your class behave like functions. So if you had a:
BitBoard board;
you could get the value for x=3, y=5 by doing:
board(3, 5)
instead of having a method you call on the board explicitly, like board.get_bit_at(3, 5).
The use of auto just means it deduces the return type from std::bitset<64>'s operator[]; since the method isn't const qualified, this means it's just deducing the std::bitset::reference type that std::bitset's operator[] uses to allow mutations via stuff like mybitset[5] = true;, even though you can't give "true" references to single bits. If reimplemented a second time as a const-qualified operator(), e.g.:
auto operator()(int x, int y) const {
assert(0<=x && x<=7);
assert(0<=y && y<=7);
return board[8*y+x];
}
you could use auto again for consistency, though it doesn't save any complexity (the return type would be bool in that case, matching std::bitset's const-qualified operator[], no harder to type than auto).
The choice to use operator() is an old hack for multidimensional data structures to work around operator[] only accepting one argument; rather than defining operator[] to return a proxy type (which itself implements another operator[] to enable access to the second dimension), you define operator() to take an arbitrary number of arguments and efficiently perform the complete lookup with no proxies required.
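As a sketch of the two overloads discussed above living side by side, here is a cut-down class with the return types spelled out instead of auto. The class name Board and the helper index are assumptions made for this example, not part of the original BitBoard.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

class Board {
    std::bitset<64> bits;

    // Maps an (x, y) pair to a bit position, checking the bounds first.
    static std::size_t index(int x, int y) {
        assert(0 <= x && x <= 7);
        assert(0 <= y && y <= 7);
        return static_cast<std::size_t>(8 * y + x);
    }

public:
    // Non-const overload: returns the proxy type, so callers can assign a bit.
    std::bitset<64>::reference operator()(int x, int y) {
        return bits[index(x, y)];
    }

    // Const overload: returns a plain bool; nothing can be modified through it.
    bool operator()(int x, int y) const {
        return bits[index(x, y)];
    }
};
```

With this, board(3, 5) = true; compiles for a non-const Board, while the same call on a const Board only reads the bit.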

operator() is the name of the function, which is then followed by another pair of parentheses listing the arguments. It is the function-call operator and overloading it allows you to make objects that act like functions/function pointers. In this case, it allows:
BitBoard thing;
thing(i, j); // looks like a function!
In this particular case, it's being used for indexing (like a[i]) but the subscript operator operator[] doesn't allow multiple indexes and the function-call operator does. So it was pretty common to see this for multi-dimensional arrays.
However, the new "preferred" style for multiple indexes in C++ is to pass a list to the subscript operator:
BitBoard thing;
std::cout << thing[{i, j}];
This would be accomplished by operator[](std::array<int, 2> xy).
But the author of this class has chosen the old way, that looks like a function call.
Overloaded operator() is also what makes lambda expressions tick inside.
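To illustrate that last point, the sketch below sorts once with a lambda and once with a hand-written struct that is roughly what a compiler generates for that lambda. The struct name DescendingClosure and both wrapper functions are made up for this example; the real closure type has no name you can spell.

```cpp
#include <algorithm>
#include <vector>

// Roughly the class a compiler creates for the lambda used below:
// a type with an operator() and no other members (nothing is captured).
struct DescendingClosure {
    bool operator()(int x, int y) const { return x > y; }
};

std::vector<int> sortWithLambda(std::vector<int> values) {
    std::sort(values.begin(), values.end(),
              [](int x, int y) { return x > y; });
    return values;
}

std::vector<int> sortWithClosureStruct(std::vector<int> values) {
    std::sort(values.begin(), values.end(), DescendingClosure());
    return values;
}
```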

How can a function returning bool and having two int arguments be called without its arguments in c++?

I am using std::sort to sort an array in descending order.
#include <iostream>
#include <algorithm>
#define size 5
using namespace std;
bool descending(int x,int y){
return x>y;
}
int main(){
int a[size]={5,3,7,34,2};
sort(a,a+size,descending);
return 0;
}
This code works. But I'm not sure why.
Shouldn't descending be called with 2 arguments?

You are not calling descending() here. You are passing it to std::sort() as a function pointer, and std::sort() accepts any callable with a signature equivalent to
bool cmp(const Type1 &a, const Type2 &b);
as its third argument.
According to cppreference, the third argument to std::sort is comp, which is a
comparison function object (i.e. an object that satisfies the requirements of Compare) which returns true if the first argument is less than (i.e. is ordered before) the second.
The signature of the comparison function should be equivalent to the following:
bool cmp(const Type1 &a, const Type2 &b);
While the signature does not need to have const &, the function must not modify the objects passed to it and must be able to accept all values of type (possibly const) Type1 and Type2 regardless of value category (thus, Type1 & is not allowed, nor is Type1 unless for Type1 a move is equivalent to a copy (since C++11)).
The types Type1 and Type2 must be such that an object of type RandomIt can be dereferenced and then implicitly converted to both of them. 
Source: std::sort on cppreference

On this line sort(a,a+size,descending); the function descending is not being called at all. It is being passed to the function sort. The function sort will call descending and it will use two arguments (which it will get from the array a).
The take-away is that there are two things you can do with a function:
You can call it.
You can use its value (its address): pass it as a parameter to another function, save it in a variable, etc.
There are a few technical details here which I've skipped over. For instance, what is actually being passed in your case is a pointer to the function. But the essential idea is that functions are callable but they are also values that can be passed around just like other values.
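The "save in a variable" part can be sketched as follows. The typedef IntComparator, the second comparator ascending and the wrapper sortedWith are assumptions introduced for this example.

```cpp
#include <algorithm>
#include <vector>

bool descending(int x, int y) { return x > y; }
bool ascending(int x, int y) { return x < y; }

// A pointer-to-function type that matches both comparators above.
typedef bool (*IntComparator)(int, int);

std::vector<int> sortedWith(std::vector<int> values, bool reversed) {
    // Neither function is called here; one address is stored in a variable.
    IntComparator comp = reversed ? &descending : &ascending;
    // std::sort is the one that calls comp(a, b), supplying both arguments.
    std::sort(values.begin(), values.end(), comp);
    return values;
}
```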

Short answer:
Shouldn't descending be called with 2 arguments?
No, because you don't call it. You give it to std::sort, which will then call it, passing it elements from the array as needed.

std::sort works on iterators. When it compares, it dereferences them to get at the actual values. According to cppreference, the comparison looks something like this:
if(comp(*(it + n), *it)) { ... }
Once the iterators (int pointers in this case) are dereferenced, they yield the int values they point to. This is why passing bool descending(int x, int y) as the comparator works without an issue.
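To make the "sort calls it for you" idea concrete, here is a deliberately simple insertion sort that takes a comparator the same way. It is an illustration only and not how std::sort is actually implemented; the name simpleSort is made up.

```cpp
#include <algorithm>

// Toy algorithm: an insertion sort that, like std::sort, receives the
// comparator as a parameter and calls it with two dereferenced iterators.
template <typename RandomIt, typename Compare>
void simpleSort(RandomIt first, RandomIt last, Compare comp) {
    if (first == last) return;
    for (RandomIt i = first + 1; i != last; ++i) {
        // Move *i towards the front while it "goes before" its left neighbour.
        for (RandomIt j = i; j != first && comp(*j, *(j - 1)); --j) {
            std::iter_swap(j, j - 1);
        }
    }
}

bool descending(int x, int y) { return x > y; }
```

Calling simpleSort(a, a + size, descending) never mentions the two int arguments either: they appear only inside the algorithm, at the comp(*j, *(j - 1)) call.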

binary_search, find_if and <functional>

std::find_if takes a predicate in one of its overloads. Binders make it possible to write EqualityComparators for user-defined types and use them either for dynamic comparison or static comparison.
In contrast, the binary search functions of the standard library take a comparator and a const T& to the value that should be used for comparison. This feels inconsistent to me and could possibly be less efficient, as the comparator has to be called with both arguments every time instead of having the constant argument bound to it. While it could be possible to implement std::binary_search in a way that uses std::bind, this would require all comparators to inherit from std::binary_function. Most code I've seen doesn't do that.
Is there a possible benefit from letting comparators inherit from std::binary_function when using it with algorithms that take a const T& as a value instead of letting me use the binders? Is there a reason for not providing predicate overloads in those functions?

A single-argument predicate version of std::binary_search wouldn't be able to complete in O(log n) time.
Consider the old game "guess the letter I'm thinking of". You could ask: "Is it A?" "Is it B?".. and so on until you reached the letter. That's a linear, or O(n), algorithm. But smarter would be to ask "Is it before M?" "Is it before G?" "Is it before I?" and so on until you get to the letter in question. That's a logarithmic, or O(log n), algorithm.
This is what std::binary_search does, and to do this it needs to be able to distinguish three conditions:
Candidate C is the searched-for item X
Candidate C is greater than X
Candidate C is less than X
A one-argument predicate P(x) says only "x has property P" or "x doesn't have property P". You can't get three results from this boolean function.
A comparator (say, <) lets you get three results by calculating C < X and also X < C. Then you have three possibilities:
!(C < X) && !(X < C): C is equal to X
C < X && !(X < C): C is less than X
!(C < X) && X < C: C is greater than X
Note that both X and C get bound to both parameters of < at different times, which is why you can't just bind X to one argument of < and use that.
Edit: thanks to jpalecek for reminding me binary_search uses <, not <=.
Edit edit: thanks to Rob Kennedy for clarification.
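The three-way decision described above can be sketched as a hand-written binary search that uses nothing but the comparator. This is an illustration under the assumption that the vector is already sorted according to comp; it is not the library's implementation, and the names containsSorted and containsInt are made up.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Searches a vector sorted according to comp, using only comp to compare.
template <typename T, typename Compare>
bool containsSorted(const std::vector<T>& sorted, const T& x, Compare comp) {
    std::size_t lo = 0;
    std::size_t hi = sorted.size();      // the search window is [lo, hi)
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        const T& c = sorted[mid];
        if (comp(c, x)) {
            lo = mid + 1;                // C < X: discard the left half
        } else if (comp(x, c)) {
            hi = mid;                    // X < C: discard the right half
        } else {
            return true;                 // neither is less: C is equivalent to X
        }
    }
    return false;
}

// Convenience wrapper for ints sorted in ascending order.
bool containsInt(const std::vector<int>& sorted, int x) {
    return containsSorted(sorted, x, std::less<int>());
}
```

X appears as the second argument in comp(c, x) and as the first in comp(x, c), which is the reason a comparator with X bound to one fixed position would not be enough.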

They are completely different algorithms: find_if looks linearly for the first item for which the predicate is true, while binary_search takes advantage of the fact that the range is sorted to test in logarithmic time whether a given value is in it.
The predicate for binary_search specifies the function according to which the range is ordered (you'd most likely want to use the same predicate you used for sorting it).
You can't take advantage of the sortedness to search for a value satisfying some completely unrelated predicate (you'd have to use find_if anyway). Note however, that with a sorted range you can do more than just test for existence with lower_bound, upper_bound and equal_range.
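As a small sketch of what lower_bound and equal_range give you beyond a yes/no answer, assuming a vector of ints sorted in ascending order (the function names countInSorted and firstNotLess are made up for this example):

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Counts how many elements of an ascending-sorted vector are equivalent to value.
std::size_t countInSorted(const std::vector<int>& sorted, int value) {
    std::pair<std::vector<int>::const_iterator,
              std::vector<int>::const_iterator> range =
        std::equal_range(sorted.begin(), sorted.end(), value);
    return static_cast<std::size_t>(range.second - range.first);
}

// Index of the first element that is not less than value
// (equal to sorted.size() if there is no such element).
std::size_t firstNotLess(const std::vector<int>& sorted, int value) {
    return static_cast<std::size_t>(
        std::lower_bound(sorted.begin(), sorted.end(), value) - sorted.begin());
}
```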
The question of what the purpose of std::binary_function is, is an interesting one.
All it does is provide typedefs for result_type, first_argument_type and second_argument_type. These would allow users, given a functor as a template argument, to find out and use these types, e.g.
template <class T, class BinaryFunction>
void foo(const T& a, const T& b, BinaryFunction f)
{
//declare a variable to store the result of the function call
typename BinaryFunction::result_type result = f(a, b);
//...
}
However, I think the only place where they are used in the standard library is creating other functor wrappers like bind1st, bind2nd, not1, not2. (If they were used for other purposes, people would yell at you any time you used a function as a functor since it would be an unportable thing to do.)
For example, binary_negate might be implemented as (GCC):
template<typename _Predicate>
class binary_negate
: public binary_function<typename _Predicate::first_argument_type,
typename _Predicate::second_argument_type, bool>
{
protected:
_Predicate _M_pred;
public:
explicit
binary_negate(const _Predicate& __x) : _M_pred(__x) { }
bool
operator()(const typename _Predicate::first_argument_type& __x,
const typename _Predicate::second_argument_type& __y) const
{ return !_M_pred(__x, __y); }
};
Of course, operator() could perhaps just be a template, in which case those typedefs would be unnecessary (any downsides?). There are probably also metaprogramming techniques to find out what the argument types are without requiring the user to typedef them explicitly. I suppose it would somewhat get into the way with the power that C++0x gives - e.g when I'd like to implement a negator for a function of any arity with variadic templates...
(IMO the C++98 functors are a bit too inflexible and primitive compared for example to std::tr1::bind and std::tr1::mem_fn, but probably at the time compiler support for metaprogramming techniques required to make those work was not that good, and perhaps the techniques were still being discovered.)
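The idea of making operator() a template, so that the typedefs become unnecessary, could look roughly like this with C++11 variadic templates. It is only a sketch: Negator, makeNegator and the two sample predicates are hypothetical names, not standard library components.

```cpp
#include <utility>

// Hypothetical wrapper that negates a predicate of any arity.
template <typename Predicate>
class Negator {
    Predicate pred;

public:
    explicit Negator(const Predicate& p) : pred(p) {}

    // operator() is itself a template, so first_argument_type and
    // second_argument_type typedefs are not needed.
    template <typename... Args>
    bool operator()(Args&&... args) const {
        return !pred(std::forward<Args>(args)...);
    }
};

// Helper that deduces the predicate type; taking it by value lets a
// plain function decay to a function pointer.
template <typename Predicate>
Negator<Predicate> makeNegator(Predicate p) {
    return Negator<Predicate>(p);
}

// Sample predicates, used only to exercise the wrapper.
bool isEven(int n) { return n % 2 == 0; }
bool isLess(int a, int b) { return a < b; }
```

Because nothing is read from the predicate's type, a plain function works here without any adaptor such as ptr_fun.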

This is a misunderstanding of the Functor concept in C++.
It has nothing to do with inheritance. The property that makes an object a functor (eligible for passing to any of the algorithms) is the validity of the expression object(x) or object(x, y), respectively, regardless of whether it is a function pointer or an object with an overloaded function call operator. Definitely not inheritance from anything. The same applies to std::bind.
The use of binary functors as comparators comes from the fact that comparators (eg. std::less) are binary functors and it's good to be able to use them directly.
IMHO there would be no gain in providing or using the predicate version you propose (after all, it takes just passing one reference). There would be no (performance) gain in using binders, because it does the same thing as the algorithm (bind would pass the extra argument in lieu of the algorithm).

Using stable_sort() to sort doubles as ints

I have a huge array of ints that I need to sort. The catch here is that each entry in the list has a number of other associated elements in it that need to follow that int around as it gets sorted. I've kind of solved this problem by changing the sorting to sort doubles instead of ints. I've tagged each number before it was sorted with a fractional part denoting that value's original location before the sort, thus allowing me to reference its associated data and allowing me to efficiently rebuild the sorted list with all the associated elements.
My problem is that I want to sort the double values by ints using the function stable_sort().
I'm referring to this web page: http://www.cplusplus.com/reference/algorithm/stable_sort/
However, since I'm a new programmer, I don't quite understand how they managed to get the sort by ints to work. What exactly am I supposed to put into that third argument to make the function work? (I know I can just copy and paste it and make it work, but I want to learn and understand this too.)
Thanks,
-Faken
Edit: Please note that I'm a new programmer who has had no formal programming training. I'm learning as I go, so please keep your explanations as simple and as rudimentary as possible.
In short, please treat me as if I have never seen C++ code before.

Since you say you're not familiar with vectors (you really should learn STL containers ASAP, though), I assume you're playing with arrays. Something along these lines:
int a[] = { 3, 1, 2 };
std::stable_sort(a, a + 3);
The third optional argument f of stable_sort is a function object - that is, anything which can be called like a function by following it with parentheses - f(a, b). A function (or rather a pointer to one) is a function object; other kinds include classes with overloaded operator(), but for your purposes a plain function would probably do.
Now you have your data type with int field on which you want to sort, and some additional data:
struct foo {
int n;
// data
...
};
foo a[] = { ... };
To sort this (or anything, really), stable_sort needs to have some way of comparing any two elements to see which one is greater. By default it simply uses operator < to compare; if the element type supports it directly, that is. Obviously, int does; it is also possible to overload operator< for your struct, and it will be picked up as well, but you asked about a different approach.
This is what the third argument is for - when it is provided, stable_sort calls it every time it needs to make a comparison, passing two elements as the arguments to the call. The called function (or function object, in general) must return true if first argument is less than second for the purpose of sorting, or false if it is greater or equal - in other words, it must work like operator < itself does (except that you define the way you want things to be compared). For foo, you just want to compare n, and leave the rest alone. So:
bool compare_foo_n(const foo& l, const foo& r) {
return l.n < r.n;
}
And now you use it by passing the pointer to this function (represented simply by its name) to stable_sort:
std::stable_sort(a, a + 3, compare_foo_n);
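Putting the pieces together, a complete version might look like the sketch below. Since the contents of foo are elided above, it uses a stand-in struct Record with a single std::string as the associated data, and a std::vector instead of an array; those choices and the names compareRecordN and sortedByN are assumptions for the example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Stand-in record: the key to sort on plus data that must travel with it.
struct Record {
    int n;
    std::string label;
};

// Compares two records by their key only, ignoring the attached data.
bool compareRecordN(const Record& l, const Record& r) {
    return l.n < r.n;
}

// Returns a copy sorted by n; records with equal n keep their original order.
std::vector<Record> sortedByN(std::vector<Record> records) {
    std::stable_sort(records.begin(), records.end(), compareRecordN);
    return records;
}
```

Because whole records are moved around, the associated data follows its key automatically, and no fractional tag is needed to find it again afterwards.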

You need to pass the comparison function. Something like this:
#include <algorithm>
#include <vector>
bool intCompare(double first, double second)
{
return static_cast<int>(first) < static_cast<int>(second);
}
int main()
{
std::vector<double> v;
v.push_back(1.4);
v.push_back(1.3);
v.push_back(2.1);
v.push_back(1.5);
std::stable_sort(v.begin(), v.end(), intCompare);
return 0;
}
Inside the sort algorithm, to compare the values the comparison function passed by you is used. If you have a more complex data structure and want to sort on a particular attribute of the data structure then you can use this user-defined function to compare the values.

I believe you are talking about this function:
bool compare_as_ints (double i,double j)
{
return (int(i)<int(j));
}
And the function call:
stable_sort (myvector.begin(), myvector.end(), compare_as_ints);
The function compare_as_ints is a normal function but this is being passed to the stable_sort as a function pointer. i.e., the address of the function is being passed which would be used by stable_sort internally to compare the values.
Look at this function pointer tutorial if you are unclear about this.
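To see what "stable" buys you with this comparison, here is a sketch that wraps the call in a small function. The wrapper name sortedAsInts is made up; compare_as_ints is the function quoted above.

```cpp
#include <algorithm>
#include <vector>

// Compares two doubles by their integer parts only.
bool compare_as_ints(double i, double j) {
    return int(i) < int(j);
}

// Returns a copy ordered by integer part. Values whose integer parts are
// equal keep the relative order they had in the input, which is what
// distinguishes std::stable_sort from std::sort.
std::vector<double> sortedAsInts(std::vector<double> values) {
    std::stable_sort(values.begin(), values.end(), compare_as_ints);
    return values;
}
```

For the input 3.14 1.41 2.72 4.67 1.73 1.32 1.62 2.58 this gives 1.41 1.73 1.32 1.62 2.72 2.58 3.14 4.67: the four values with integer part 1 come first, still in their original order.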
