namespace std overloading less than - c++

I was curious about why this piece of code doesn't work:
#include "stdafx.h"
#include <iostream>
#include <tuple>
#include <string>
#include <vector>
#include <algorithm>
typedef std::tuple<int, std::string> intString;
bool operator<(intString& lhs, intString& rhs){
return std::tie(std::get<1>(lhs), std::get<0>(lhs)) < std::tie(std::get<1>(rhs), std::get<0>(rhs));
}
void printIntStrings(std::vector<intString>& v){
for (intString& i : v){
std::cout << std::get<0>(i) << " is " << std::get<1>(i) << std::endl;
}
}
int main(int argc, char* argv[])
{
std::vector<intString> v;
v.push_back(std::make_tuple(5, "five"));
v.push_back(std::make_tuple(2, "two"));
v.push_back(std::make_tuple(9, "nine"));
printIntStrings(v);
std::sort(v.begin(), v.end());
printIntStrings(v);
return 0;
}
As far as I can understand, I simply create a vector of intStrings and my operator should sort by the second element in the tuple first thus the output should be (last 3 lines anyway)
5 five
9 nine
2 two
However running it on my machine I get
2 two
5 five
9 nine
which implies that the sort is using the default less than operator, ignoring the one I specified. Note, adding const before the parameters didn't seem to affect anything.
I found three ways to "fix" this.
Fix #1
surround bool operator< ... in namespace std like so:
namespace std{
bool operator<(intString& lhs, intString& rhs){
return std::tie(std::get<1>(lhs), std::get<0>(lhs)) < std::tie(std::get<1>(rhs), std::get<0>(rhs));
}
}
However I was told we should never add things to the std namespace since that behavior is undefined, so this fix seems the worst.
Fix #2
Add in something custom to the tuple like so:
enum class TRASH{DOESNTMATTER};
typedef std::tuple<int, std::string, TRASH> intString;
bool operator<(intString& lhs, intString& rhs){
return std::tie(std::get<1>(lhs), std::get<0>(lhs)) < std::tie(std::get<1>(rhs), std::get<0>(rhs));
}
(and obviously add in TRASH::DOESNTMATTER as the third make_tuple argument)
However, this seemed like a lot of work for something this simple. Also, it seems wasteful since the enum is not meaningfully used.
Fix #3
Use the predicate sort like so:
std::sort(v.begin(), v.end(), operator<);
This seemed to be the most elegant solution. However, I don't see why I have to explicitly tell the compiler to use my defined operator<.
So I want to know:
1) why this happens? Shouldn't c++ find my implementation and use that?
2) which "fix" is the best? if none of the ones I found, what would you recommend?
Any ideas? Thanks for reading!

Your operator< overload is not visible at the point where < is used (which is in the body of std::sort and/or any helper functions called by it, somewhere in <algorithm>).
If it is to be used, it must be picked up by argument-dependent lookup; but there's nothing in std::tuple<int, std::string> that has the global namespace as an associated namespace, so ADL doesn't help you, either, and the standard one is used.
Passing it as a comparator, preferably as a lambda or function object (which inline better than function pointers), is the simplest fix. I'd also recommend renaming it: having an operator< overload with completely different semantics from the standard one, which may or may not be used by the expression a < b depending on where that expression appears, is not a good idea.
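A minimal sketch of the comparator approach, assuming the same tuple layout as in the question. The names IntString, ByNameThenNumber and sortByName are made up for illustration; the point is only that the ordering has its own name and is handed to std::sort explicitly, so no lookup of operator< is involved.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

using IntString = std::tuple<int, std::string>;

// Hypothetical named comparator: orders by the string first, then by the int.
struct ByNameThenNumber {
    bool operator()(const IntString& lhs, const IntString& rhs) const {
        return std::tie(std::get<1>(lhs), std::get<0>(lhs)) <
               std::tie(std::get<1>(rhs), std::get<0>(rhs));
    }
};

// Sorts in place with the explicit comparator instead of relying on operator<.
inline void sortByName(std::vector<IntString>& v) {
    std::sort(v.begin(), v.end(), ByNameThenNumber{});
}
```

Calling sortByName(v) on the vector from the question should put the tuples in the order five, nine, two.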

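You already fixed it yourself.
The problem is that your operator< does not replace the default operator< for std::tuple: the two live in different namespaces, and only the one in std is found from inside std::sort.
Fix #1 appears to work because it puts your overload in the namespace where lookup finds it, but adding such declarations to namespace std is undefined behavior, so I would not rely on it.
That leaves Fix #3, passing the comparison to std::sort explicitly, as the safe solution.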

Related

How do I insert into boost::unordered_set<boost::unordered_set<int> >?

The following code fails to compile, but if I remove the commented line, it compiles and runs correctly. I was only intending to use boost because C++ doesn't provide a hash function for std::unordered_set<int> by default.
#include <iostream>
#include <boost/unordered_set.hpp>
int main() {
boost::unordered_set<boost::unordered_set<int> > fam;
boost::unordered_set<int> s;
s.insert(5);
s.insert(6);
s.insert(7);
std::cout << s.size() << std::endl;
fam.insert(s); // this is the line causing the problem
return 0;
}
Edit 1:
I want to be more clear than I was in the OP. First I know that the idea of the boost::unordered_set<> is that it is implemented with a hash table, rather than a BST. I know that anything that is to be a template type to the boost::unordered_set<> needs to have a hash function and equality function provided. I also know that by default the std::unordered_set<> does not have a hash function defined which is why the following code does not compile:
#include <iostream>
#include <unordered_set>
int main() {
std::unordered_set<std::unordered_set<int> > fam;
return 0;
}
However, I thought that boost provides hash functions for all their containers which is why I believed the following code does compile:
#include <iostream>
#include <boost/unordered_set.hpp>
int main() {
boost::unordered_set<boost::unordered_set<int> > fam;
return 0;
}
So now, I'm not sure why boost code just above compiles, but the code in the OP does not. Was I wrong that boost provides a hash function for all their containers? I would really like to avoid having to define a new hash function, especially when my actual intended use is to have a much more complicated data structure: boost::unordered_map<std::pair<boost::unordered_map<int, int>, boost::unordered_map<int, int> >, int>. It seems like this should be a solved problem that I shouldn't have to define myself, since IIRC python can handle sets of sets no problem.
An unordered_set (or _map) uses hashing, and requires a hash operator to be defined for its elements. There is no hash operator defined for boost::unordered_set<int>, therefore it cannot put such a type of element into your set.
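You may write your own hash function for this. For example, the following is a simple generic approach, though you may want to customize it for your particular data. Because two equal unordered sets can iterate their elements in different orders, the element hashes are combined with addition, which is commutative. If you drop this code into your example, it should work:
namespace boost {
    // Found by boost::hash through argument-dependent lookup.
    std::size_t hash_value(boost::unordered_set<int> const& arg) {
        std::size_t hashCode = 0;
        for (int e : arg)
            hashCode += hash<int>{}(e); // order-independent combination
        return hashCode;
    }
}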
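The same idea can be sketched with the standard containers only, which avoids the Boost dependency. IntSetHash and SetOfSets are hypothetical names; the hasher is passed as the second template argument, and equality comes from the operator== that std::unordered_set already provides. Summing element hashes is a simple choice rather than a particularly strong one.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical hasher for std::unordered_set<int>. Addition is commutative,
// so the result does not depend on the iteration order of the set.
struct IntSetHash {
    std::size_t operator()(const std::unordered_set<int>& s) const {
        std::size_t h = 0;
        for (int e : s) h += std::hash<int>{}(e);
        return h;
    }
};

// A set of sets that uses the hasher above.
using SetOfSets = std::unordered_set<std::unordered_set<int>, IntSetHash>;
```

With this in place, inserting {5, 6, 7} and then {7, 6, 5} into a SetOfSets should leave a single element, because the two sets compare equal and hash identically.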

How does modern compiler optimize function object in c++?

According to the book Effective C++, passing a function object by value performs better than passing a function reference or function pointer. How do modern compilers optimize that scenario?
Put another way: we usually don't recommend passing objects of user-defined classes by value, yet a function object is just a normal object whose class implements operator(). So the compiler must treat the two cases differently when they are passed by value, right?
Below is a case giving a comparison between the function object and function pointer.
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <functional>
#include <iostream>
#include <vector>
bool cmp(int a, int b) { return a < b; }
int main() {
    std::vector<int> v;
    v.reserve(10000000); // reserve capacity only, so push_back yields exactly 10000000 elements
    for (size_t i = 0; i < 10000000; ++i)
        v.push_back(rand());
    std::vector<int> v2(v);
    std::sort(v.begin(), v.end(), std::less<int>()); // This way would be faster than below;
    std::sort(v2.begin(), v2.end(), cmp);
}
With a function pointer, the compiler is likely to pass the pointer and perform an indirect call, instead of making a direct call or inlining.
In contrast, the operator() of a function object is likely to be inlined, or at least called directly, because the function itself is not passed; only its data is (by value or by reference). For a function object without data, nothing needs to be passed (it compiles to a dummy integer, or to nothing at all).
This is especially true of std::function: the implementation has almost no way to avoid a double indirect call when it wraps a function pointer.
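To make the difference in what the compiler sees concrete, here is a small sketch. isSortedWith and lessThan are hypothetical names, and the template only mimics how std::sort receives its comparator; it makes no claim about what any particular compiler will actually inline.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

inline bool lessThan(int a, int b) { return a < b; }

// Hypothetical helper: Compare is a template parameter, as in std::sort.
// With Compare = std::less<int>, the type alone names the code to call.
// With Compare = bool (*)(int, int), the type only fixes the signature,
// and the target is a runtime value stored in comp.
template <typename Compare>
bool isSortedWith(const std::vector<int>& v, Compare comp) {
    for (std::size_t i = 1; i < v.size(); ++i)
        if (comp(v[i], v[i - 1])) return false;
    return true;
}
```

isSortedWith(v, std::less<int>()) and isSortedWith(v, lessThan) compute the same result, but they instantiate two different functions, and only the first has the comparison baked into its type.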
A lambda is the easiest way to get this optimization. Here is your example with one character difference:
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <vector>
int main() {
    std::vector<int> v;
    v.reserve(10000000); // reserve capacity only, so push_back yields exactly 10000000 elements
    for (size_t i = 0; i < 10000000; ++i)
        v.push_back(rand());
    std::vector<int> v2(v);
    std::sort(v.begin(), v.end(), [] (int a, int b) { return a < b; }); // This way would be faster than below;
    std::sort(v2.begin(), v2.end(), +[] (int a, int b) { return a < b; }); // unary + converts the lambda to a function pointer
}
Modern compilers have not gone much further than older ones in this regard, although you can try your example on different compilers to check (https://godbolt.org/ lets you inspect the disassembly).
In gcc 7.5, std::sort internally uses the __gnu_cxx::__ops::_Iter_comp_iter template, which looks like this:
template<typename _Compare>
struct _Iter_comp_iter
{
_Compare _M_comp;
explicit _GLIBCXX14_CONSTEXPR
_Iter_comp_iter(_Compare __comp) : _M_comp(_GLIBCXX_MOVE(__comp)) { }
template<typename _Iterator1, typename _Iterator2>
_GLIBCXX14_CONSTEXPR bool
operator()(_Iterator1 __it1, _Iterator2 __it2)
{ return bool(_M_comp(*__it1, *__it2)); }
};
In the first case _Compare is std::less<int>, in the second -- bool (*)(int, int).
In the first case gcc inlines comparison, while in the second it generates something like callq *%r13 to call that pointer stored in _M_comp.
Update:
After more digging around prompted by comments, it turns out that the problem is not in the type of _Compare -- gcc 7.5 can inline small pure functions with function pointers, too, even without inline modifier -- but rather in presence of recursion in the internal workings of std::sort. That throws the compiler off and it generates indirect call. Good news is that gcc 8+ seems to be free of this drawback.

c++ syntax to find min in a list

I see some complicated discussion on this topic but I'm failing to make what should be a very simple piece of code work. I've found two ways to make it work but I'm puzzled by the some issues of syntax. I just want to find the minimum value in a list. This works:
#include <iostream>
#include <list>
#include <algorithm>
using namespace std;
int main(void)
{
    auto init_list = {10, 20, 30};
    list<int> alist(init_list);
    list<int>::iterator minnum = min_element(begin(alist), end(alist));
    cout << "The min in the list is: " << *minnum << endl;
    return 0;
}
But there is also the definition of min() which takes an initializer_list as an argument. I think I am misunderstanding the syntax of it. Since I've already defined an init_list I can call
int minint = min(init_list);
and that works. But I don't understand why. The documentation on min says this call should be of form:
template< class T, class Compare >
T min( std::initializer_list<T> ilist, Compare comp );
So shouldn't I have to specify a < comparison operator to be able to use this (apparently not...why?)? However, I don't know how to pass this operator correctly, and have been unable to come up with a clear explanation. Each of the following generate a wide variety of syntax errors:
int minint = min(init_list, operator<);
int minint = min(init_list, int::operator<);
How am I supposed to pass the operator? Clearly it doesn't matter in this case but if my list was not a list of ints but a list of some custom type with overloaded operators then would I need to pass an operator< ? Or does the syntax simply mean that class T must have a comparison operator defined, but you never have to specify it in the min function call?
There is a version of min (the third declaration) that takes only an initializer list; it compares the elements with operator<.
If you want to be explicit about comparing with <, use std::less:
int minint = min(init_list, std::less<int>());
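For a custom type that has no operator<, the two-argument overload takes any callable as the comparator. A minimal sketch, where Reading and smallest are hypothetical names and the lambda stands in for the Compare parameter; note that std::min requires the initializer list to be non-empty.

```cpp
#include <algorithm>
#include <initializer_list>

// Hypothetical type with no operator< of its own.
struct Reading {
    int index;
    double value;
};

// Returns the reading with the smallest value. The lambda plays the role of
// the Compare argument in std::min(ilist, comp). The list must not be empty.
inline Reading smallest(std::initializer_list<Reading> readings) {
    return std::min(readings, [](const Reading& a, const Reading& b) {
        return a.value < b.value;
    });
}
```

For example, smallest({{1, 5.5}, {2, 3.5}, {3, 26.5}}) should return the reading with index 2.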
Thanks Tomasz Klak. I'm pretty new to operator overloading so I didn't know the std::less syntax. That's very useful to know. Clearly I wasn't reading carefully enough, since I missed that other definition of min(). My other question was about how this works with custom objects. I think I've answered that for myself now. Here's what I've found.
You can make an initialization list of custom objects. For example, I have a very simple class that I call Double_list_struct, which is just a container for an int (the index) and a double (the value), with some typical member functions (constructors, getters, setters, toString() and overloaded operators, including operator<). So I can make an initialization list of it by:
Double_list_struct doub1(1,5.5);
Double_list_struct doub2(1,3.5);
Double_list_struct doub3(1,26.5);
auto init_list_doub = {doub1, doub2, doub3};
Having done this the min() function works on it perfectly well.
Double_list_struct mindoub = min(init_list_doub);
I assume (but haven't checked) that if the Double_list_struct class hadn't included an operator< function I would have had to provide one.
Cheers!

add subtract multiply and divide template functions

I am trying to find template functions that do:
template <typename T>
T add(T lhs, T rhs) {
return lhs + rhs;
}
(for add, subtract, multiply, and divide).
I remember there being a standard set of functions for this-- do you remember what they are?
In the header <functional>, you'll find things like std::plus, std::minus, std::multiplies, and std::divides.
They're not functions, either. They're actually functors.
You need functors such as std::plus from the <functional> header. See Arithmetic operations here.
These are functors, not functions, so you need an instance to do anything useful:
#include <functional>
#include <iostream>
int main() {
std::multiplies<int> m;
std::cout << m(5,3) << "\n";
}
This seems like overkill in the above sample, but they are pretty useful with standard library algorithms. For example, find the product of the elements in a vector (std::accumulate is declared in <numeric>):
std::vector<int> v{1,2,3,4,5,6};
int prod = std::accumulate(v.begin(), v.end(), 1, std::multiplies<int>());
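Spelled out with its headers, the same idea might look like the sketch below. The wrapper names product and sum are invented for illustration; the functors and std::accumulate are the standard ones.

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Product of all elements; the initial value 1 is the multiplicative identity.
inline int product(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 1, std::multiplies<int>());
}

// Sum of all elements, using std::plus with 0 as the starting value.
inline int sum(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0, std::plus<int>());
}
```

std::minus and std::divides are called the same way, for instance std::minus<int>()(5, 3).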

Literate Coding Vs. std::pair, solutions?

As most programmers I admire and try to follow the principles of Literate programming, but in C++ I routinely find myself using std::pair, for a gazillion common tasks. But std::pair is, IMHO, a vile enemy of literate programming...
My point is when I come back to code I've written a day or two ago, and I see manipulations of a std::pair (typically as an iterator) I wonder to myself "what did iter->first and iter->second mean???".
I'm guessing others have the same doubts when looking at their std::pair code, so I was wondering, has anyone come up with some good solutions to recover literacy when using std::pair?
std::pair is a good way to make a "local" and essentially anonymous type with essentially anonymous columns; if you're using a certain pair over so large a lexical space that you need to name the type and columns, I'd use a plain struct instead.
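As a rough sketch of the plain-struct alternative (Entry and setValue are hypothetical names, not taken from any answer here), the members simply carry their meaning in their names:

```cpp
#include <string>
#include <vector>

// Hypothetical record replacing std::pair<int, std::string>.
struct Entry {
    int key;
    std::string value;
};

// Sets the value of the first entry whose key matches.
// Returns true if such an entry was found.
inline bool setValue(std::vector<Entry>& entries, int key, const std::string& value) {
    for (Entry& e : entries) {
        if (e.key == key) {
            e.value = value;
            return true;
        }
    }
    return false;
}
```

Code that reads e.key and e.value needs no comment explaining which member is which.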
How about this:
struct MyPair : public std::pair < int, std::string >
{
const int& keyInt() const { return first; }
void keyInt( const int& keyInt ) { first = keyInt; }
const std::string& valueString() const { return second; }
void valueString( const std::string& valueString ) { second = valueString; }
};
It's a bit verbose, however using this in your code might make things a little easier to read, eg:
std::vector < MyPair > listPairs;
std::vector < MyPair >::iterator iterPair( listPairs.begin() );
if ( iterPair->keyInt() == 123 )
iterPair->valueString( "hello" );
Other than this, I can't see any silver bullet that's going to make things much clearer.
typedef std::pair<bool, int> IsPresent_Value;
typedef std::pair<double, int> Price_Quantity;
...you get the point.
You can create two pairs of getters (const and non) that will merely return a reference to first and second, but will be much more readable. For instance:
string& GetField(pair<string, int>& p) { return p.first; }
int& GetValue(pair<string, int>& p) { return p.second; }
Will let you get the field and value members from a given pair without having to remember which member holds what.
If you expect to use this a lot, you could also create a macro that will generate those getters for you, given the names and types: MAKE_PAIR_GETTERS(Field, string, Value, int) or so. Making the getters straightforward will probably allow the compiler to optimize them away, so they'll add no overhead at runtime; and using the macro will make it a snap to create those getters for whatever use you make of pairs.
You could use boost tuples, but they don't really alter the underlying issue: do you really want to access each part of the pair/tuple with a small integral index, or do you want more 'literate' code? See this question I posted a while back.
However, boost::optional is a useful tool which I've found replaces quite a few of the cases where pairs/tuples are touted as the answer.
Recently I've found myself using boost::tuple as a replacement for std::pair. You can define enumerators for each member and so it's obvious what each member is:
typedef boost::tuple<int, int> KeyValueTuple;
enum {
KEY
, VALUE
};
void foo (KeyValueTuple & p) {
p.get<KEY> () = 0;
p.get<VALUE> () = 0;
}
void bar (int key, int value)
{
KeyValueTuple t (key, value); // foo takes a non-const reference, so pass a named tuple
foo (t);
}
BTW, comments welcome on if there is a hidden cost to using this approach.
EDIT: Remove names from global scope.
Just a quick comment regarding global namespace. In general I would use:
struct KeyValueTraits
{
typedef boost::tuple<int, int> Type;
enum {
KEY
, VALUE
};
};
void foo (KeyValueTraits::Type & p) {
p.get<KeyValueTraits::KEY> () = 0;
p.get<KeyValueTraits::VALUE> () = 0;
}
It does look to be the case that boost::fusion does tie the identity and value closer together.
As Alex mentioned, std::pair is very convenient, but when it gets confusing, create a structure and use it in the same way. Have a look at the std::pair code; it's not that complex.
I don't like std::pair as used in std::map either, map entries should have had members key and value.
I even used boost::MIC to avoid this. However, boost::MIC also comes with a cost.
Also, returning a std::pair results in less than readable code:
if (cntnr.insert(newEntry).second) { ... }
???
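If C++17 is available (an assumption; nothing in this thread says so), structured bindings let you name both halves of a pair at the point of use. A minimal sketch with hypothetical helper names addEntry and joinValues:

```cpp
#include <map>
#include <string>

// Inserts a key/value pair and reports whether the key was new.
// The pair returned by insert is unpacked into two named variables.
inline bool addEntry(std::map<int, std::string>& m, int key, const std::string& value) {
    auto [position, inserted] = m.insert({key, value});
    (void)position; // iterator to the element; unused in this sketch
    return inserted;
}

// Concatenates all values in key order, naming the map entry's two parts.
inline std::string joinValues(const std::map<int, std::string>& m) {
    std::string out;
    for (const auto& [key, value] : m) {
        (void)key; // only the value is needed here
        out += value;
    }
    return out;
}
```

This keeps std::pair in the interface but removes first and second from the calling code.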
I have also found that std::pair is commonly used by lazy programmers who needed two values but didn't think about why those values were needed together.