Mechanism of using sets as keys to map - c++

I have this code which I do not understand why it works:
map<set<int>,int> states;
set<int> s1 = {5,1,3}, s2 = {1,5,3};
states[s1] = 42;
printf("%d", states[s2]); // 42
The output is 42, so the values of the states key are used for comparison somehow. How is this possible? I'd expect this not to work as in the similar example:
map<const char*,int> states;
char s1[]="foo", s2[]="foo";
states[s1] = 42;
printf("%d",states[s2]); // not 42
Here the address of the char pointer is used as the key, not the value of the memory where it points, right? Please explain what's the difference between these two samples.
Edit: I've just found something about comparison object which explains a lot. But how is the comparison object created for sets? I can't see how it could be the default less object.

You were onto something with the comparison object.
One of the template parameters for a C++ map defines what acts as the comparison predicate, by default std::less<Key>, in this case std::less<set<int>>.
From cplusplus.com:
The map object uses this expression to determine both the order the elements follow in the container and whether two element keys are equivalent (by comparing them reflexively: they are equivalent if !comp(a,b) && !comp(b,a)). No two elements in a map container can have equivalent keys.
std::less:
Binary function object class whose call returns whether its first argument compares less than the second (as returned by operator <).
std::set::key_comp:
By default, this is a less object, which returns the same as operator<.
Now, what does the less-than operator do?:
The less-than comparison (operator<) behaves as if using algorithm lexicographical_compare, which compares the elements sequentially using operator< in a reciprocal manner (i.e., checking both a<b and b<a) and stopping at the first occurrence.
Or from MSDN:
The comparison between set objects is based on a pairwise comparison of their elements. The less-than relationship between two objects is based on a comparison of the first pair of unequal elements.
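So, because a set keeps its elements sorted, s1 and s2 hold the same sequence and neither compares less than the other. The map therefore treats them as equivalent keys, and using either one refers to the same entry in the map.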
But the second example uses a pointer as key, so the two equivalent values aren't equal as far as the map goes because the operator< in this case isn't defined in any special way, it's just a comparison between addresses. If you had used a std::string as key, they would have matched, though (because they do have their own operator<).
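The difference can be sketched as follows, assuming C++11 or later. The function names lookup_by_set and lookup_by_string are illustrative and do not come from the question; find() is used instead of operator[] so that a failed lookup does not insert a new entry.

```cpp
#include <map>
#include <set>
#include <string>

// Store a value under one set, then look it up with a second set that was
// built from the same elements in a different order.
int lookup_by_set() {
    std::map<std::set<int>, int> states;
    std::set<int> s1 = {5, 1, 3};
    std::set<int> s2 = {1, 5, 3};
    states[s1] = 42;
    // The map compares keys with std::less<std::set<int>>, i.e. operator<.
    auto it = states.find(s2);
    return it != states.end() ? it->second : -1;
}

// std::string compares by contents, so two distinct arrays holding the same
// characters become equivalent keys once converted to std::string.
int lookup_by_string() {
    std::map<std::string, int> states;
    char s1[] = "foo";
    char s2[] = "foo";
    states[s1] = 42;  // the char array converts to std::string
    auto it = states.find(s2);
    return it != states.end() ? it->second : -1;
}
```

Both functions return 42, whereas the const char* version in the question compares the two array addresses and misses.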

A few reasons for this behavior:
Sets are kept sorted as you insert values, so s1 and s2 look the same if you print them out.
Sets overload the comparison operators, so s1 == s2 and s1 < s2 compare the contents rather than the objects' addresses.
The second example, with char*, explicitly uses the pointer (the address) as the key.
Declaring two separate char arrays gives you different addresses for the two instances of "foo".
If you look up with a pointer to the same array instead (for example const char* s2 = s1), you should get the same behavior as with the sets.

looking at the documentation on cppreference.com:
template<
class Key,
class T,
class Compare = std::less<Key>,
class Allocator = std::allocator<std::pair<const Key, T> >
> class map;
so if you don't supply your own custom comparison function, it orders the elements by the < operator, and:
operator==,!=,<,<=,>,>=(std::set)...
Compares the contents of lhs and rhs
lexicographically. The comparison is
performed by a function equivalent to
std::lexicographical_compare.
so when you call < on two sets it compares their sorted elements lexicographically. your two sets, when sorted, are exactly the same.
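As a small sketch of that point, the helpers below (hypothetical names, not from the documentation) check that operator< on two sets agrees with std::lexicographical_compare, and test equivalence the way an ordered container would.

```cpp
#include <algorithm>
#include <set>

// True when operator< and std::lexicographical_compare give the same answer
// for the two sets.
bool less_matches_lexicographical(const std::set<int>& a, const std::set<int>& b) {
    bool by_operator = a < b;
    bool by_algorithm =
        std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end());
    return by_operator == by_algorithm;
}

// Two sets are equivalent under operator< when neither is less than the other.
bool equivalent_sets(const std::set<int>& a, const std::set<int>& b) {
    return !(a < b) && !(b < a);
}
```

With the sets from the question, equivalent_sets({5, 1, 3}, {1, 5, 3}) is true, because both are stored as 1, 3, 5.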

When you do:
cout << (s1 == s2);
You get an output of 1. I don't know how sets in C++ work; I don't even know what they are. I do know that the == comparison operator does not report whether the values were inserted in the same order. Likely, it returns whether or not the sets contain the same values, regardless of order.
Edit: Yeah, the sets are sorted. So when you create them, they get sorted, which means they become the same thing. Use a std::vector or a std::array
Edit #2: They do become equal. Let me create an analogy.
int v1 = 4 + 1;
int v2 = 1 + 4;
Now obviously, v1 and v2 would be the same value. Sets are no different; once created, the contents are sorted, and when you compare them using the == operator, it returns "yes, they are equal". std::map itself identifies keys through operator< rather than ==, but for these sets the two agree, since neither set is less than the other. That is why your code works how it works.

Related

Why is Map[2] updating a wrong key data? Is this a proper way of doing this?

#include <iostream>
#include <map>
#include <string>
using namespace std;
struct ls{
bool operator()(int lhs, int rhs){
return lhs == rhs;
}
};
int main(){
map<int,string,ls> m1 {{1,"A"},{2,"B"}};
map<int,string>::iterator i;
for(i=m1.begin();i!=m1.end();++i) {
cout<<i->first<<" - "<<i->second<<endl;
}
//If we print data here only 1, "A" data is present.
m1[2] = "C";
for(i=m1.begin();i!=m1.end();++i) {
cout<<i->first<<" - "<<i->second<<endl;
}
//the above statement updates m1[1] to "C" even though we assign to m1[2]
}
The problem is that you are not respecting the contract of std::map's third template argument, which should be the comparison function.
The comparison function, which defaults to std::less<Key>, must provide a strict weak ordering on the keys of your std::map. For this purpose the ISO standard defines that, for associative containers at §23.2.4.3:
The phrase “equivalence of keys” means the equivalence relation imposed by the comparison and not the operator== on keys. That is, two keys k1 and k2 are considered to be equivalent if for the comparison object comp, comp(k1, k2) == false && comp(k2, k1) == false. For any two keys k1 and k2 in the same container, calling comp(k1, k2) shall always return the same value.
Now in your situation you define the comparison as lhs == rhs which means that
auto b1 = ls{}(1, 2);
auto b2 = ls{}(2, 1);
are both false, so 1 and 2 are considered the same key (if a is not less than b and b is not less than a, then a is equivalent to b). This means that on map construction only the first pair is inserted.
But then with m1[2] = "C", since you are getting the reference to the value mapped to 2 and 2 compares equal to 1 according to your function, you update the only key present.
Your ls template argument is wrong. std::map requires comparison to be implemented via strict weak ordering. As §23.2.4/2 of the ISO C++ standard says:
Each associative container is parameterized on Key and an ordering
relation Compare that induces a strict weak ordering (...) on
elements of Key.
See also http://en.cppreference.com/w/cpp/concept/Compare.
Among other things, this means x cannot be less than itself, i.e. x < x must be false.
Your ls functor, however, does exactly that. When lhs is 1 and rhs is 1, then true is returned. The fact that this is incorrect should not come as a surprise; it's really all just a very technical, formal way of explaining what the English expression "something is less than something else" actually means in terms of mathematics or computer science.
In any case, since your code does not meet the requirements of std::map, your program has undefined behaviour.
The solution is simple: Just don't use ls. Instantiate your map as std::map<int, std::string> and it will work, because the default argument is an instantiation of std::less, which has the correct behaviour.
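A minimal sketch of both points, assuming C++11 or later. The helper names equivalent and build_with_default_compare are hypothetical, and the ls functor is only called directly here; it is never handed to a std::map, since that would be undefined behaviour.

```cpp
#include <functional>
#include <map>
#include <string>

// The comparator from the question, kept only to inspect its results.
// (const-qualified here so it can be called on a const object.)
struct ls {
    bool operator()(int lhs, int rhs) const { return lhs == rhs; }
};

// Keys a and b are equivalent for comp when neither orders before the other.
template <class Compare>
bool equivalent(Compare comp, int a, int b) {
    return !comp(a, b) && !comp(b, a);
}

// Same data as the question with the default std::less<int>: both keys
// survive construction, and m[2] updates the entry for key 2 only.
std::map<int, std::string> build_with_default_compare() {
    std::map<int, std::string> m = {{1, "A"}, {2, "B"}};
    m[2] = "C";
    return m;
}
```

Here equivalent(ls{}, 1, 2) is true while equivalent(std::less<int>{}, 1, 2) is false, which matches the symptom described in the question.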

Why greater<int> and less<int> showing opposite behaviour?

Please consider following code,
using namespace std;
std::priority_queue<int,vector<int>,std::greater<int>> queue; //first
queue.push(26);
queue.push(12);
queue.push(22);
queue.push(25);
std::cout<<queue.top()<<endl;
std::priority_queue<int,vector<int>,std::less<int>> queue2; //second
queue2.push(26);
queue2.push(12);
queue2.push(22);
queue2.push(25);
std::cout<<queue2.top()<<endl;
Output:
12
26
In the first definition I used greater<int>, yet I get 12 (the min value) as output, while with less<int> I get 26 (the max value).
Shouldn't greater<int> create max heap?
As far as the internal algorithm itself is concerned, std::priority_queue always creates a "max heap". You just need to teach it how to compare the elements in order for it to know what's "max".
To determine the ordering for that "max heap", it uses a "less"-style comparison predicate: when given a pair (a, b) (in that specific order) the predicate should tell whether a is less than b. By using the ordering information obtained from the predicate std::priority_queue will make sure that the greater element is at the top of the heap. Standard std::less is an example of such predicate. Whatever predicate you supply, the implementation will treat it as a "less"-style predicate.
If you supply a predicate that implements the opposite comparison (like std::greater), you will naturally end up with minimum element at the top. Basically, one can put it this way: std::priority_queue expects a "less" comparison predicate, and by supplying a "greater" comparison predicate instead you are essentially deceiving the queue (in a well-defined fashion). The primary consequence of that deceit is that for you (the external observer) "max heap" turns into a "min heap".
And this is exactly what you observe.
Because that's their job. less and greater are supposed to model operators < and > respectively and are used by priority_queue to give order to its elements.
They yield opposite results because they're defined to do so (except for equal elements).
Shouldn't greater<int> create max heap?
You're mistaking internal representation of the container with the interface of top() member function, which is supposed to yield the top element, as per the comparator.
std::priority_queue is a "max heap". You provide a less-than operator; and the element at the top is the largest.
In your second example, you provided less-than to be the intuitive std::less; and you see the largest element at the top.
In your first example, you consider a larger int to be "less-than" a smaller int; and the "largest" element based on your "less-than" is in fact the smallest int.
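The effect of swapping the predicate can be sketched with one small template, assuming C++11 or later. The name top_with is made up for this illustration.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Push the same values into a priority_queue that uses the given comparator
// and return whatever ends up at the top.
template <class Compare>
int top_with(const std::vector<int>& values) {
    std::priority_queue<int, std::vector<int>, Compare> queue;
    for (int v : values) {
        queue.push(v);
    }
    return queue.top();
}
```

For the values 26, 12, 22, 25 from the question, top_with<std::less<int>> yields 26 and top_with<std::greater<int>> yields 12.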

Basic std set logic

This may be a dull question, but I want to be sure.
Let's say I have a struct:
struct A
{
int number;
bool flag;
bool operator<(const A& other) const
{
return number < other.number;
}
};
Somewhere in code:
A a1, a2, a3;
std::set<A> set;
a1.flag = true;
a1.number = 0;
a2.flag = false;
a2.number = 10;
a3 = a1;
set.insert(a1);
set.insert(a2);
if(set.find(a3) == set.end())
{
printf("NOT FOUND");
}
else
{
printf("FOUND");
}
The output I get is "FOUND". I understand that, since I am passing values, elements in the set are compared by value. But how can objects of type A be compared by their values, since the equality operator is not overloaded? I don't understand how overloading operator '<' can be enough for the set's find function.
The ordered containers (set, multiset, map, multimap) use one single predicate to establish the element order and find values, the less-than predicate.
Two elements are considered equal if neither one is less-than the other.
This notion of "equality" may not be the same as some other notion of equality you may have. Sometimes the term "equivalent" is preferred to distinguish this notion that's induced by the less-than ordering from other, ad-hoc notions of equality that may exist simultaneously (e.g. an overloaded operator==).
For "sane" value types (also called regular types), ad-hoc equality and less-than-induced equivalence are required to be the same; many naturally occurring types are regular (e.g. arithmetic types (if NaNs are removed)). In other cases, especially if the less-than predicate is provided externally and not by the type itself, it's entirely possible that the less-than equivalence classes contain many non-"equal" values.
The flag member is entirely irrelevant here. The set has found an element that is equivalent to the searched-for value, with respect to <.
That is, if a is not less than b, and b is not less than a, then a and b must be equal. This is how it works with normal integers. That is how it is decided 2 values are equivalent in a std::set.
std::set doesn't use == at all. (unordered_set, which is a hash set, does use it, because it's the only way to distinguish hash collisions).
You can also provide a function to do the work of <, but it must behave as a strict weak ordering. Which is a bit heavy on the maths, but basically you could use > instead, via std::greater, or define your own named function rather than defining operator<.
So there is nothing technically to stop you defining an operator== that behaves differently from the notion of equivalence that comes from your operator<, but std::set won't use it, and it would probably confuse people.
From the documentation of set
two objects a and b are considered equivalent (not unique) if neither
compares less than the other: !comp(a, b) && !comp(b, a)
http://en.cppreference.com/w/cpp/container/set
In the template you can see
template<
class Key,
class Compare = std::less<Key>,
class Allocator = std::allocator<Key>
> class set;
std::less
will call operator< and that is why it works.
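A short sketch of this, reusing the struct A from the question. The helper names contains_equivalent and second_insert_accepted are hypothetical.

```cpp
#include <set>

struct A {
    int number;
    bool flag;
    bool operator<(const A& other) const { return number < other.number; }
};

// Returns true when the set holds an element equivalent to `probe`, that is,
// one for which neither element is less than the other.
bool contains_equivalent(const std::set<A>& s, const A& probe) {
    return s.find(probe) != s.end();
}

// Inserts a second element with the same number but a different flag and
// reports whether the set accepted it.
bool second_insert_accepted() {
    std::set<A> s;
    s.insert(A{0, true});
    return s.insert(A{0, false}).second;
}
```

Since operator< looks only at number, the flag member plays no part: a probe with the same number is found even if its flag differs, and second_insert_accepted() returns false.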

What does same 'value' mean for a std::set?

In C++, the std::set::insert() only inserts a value if there is not already one with the same 'value'. By the same, does this mean operator== or does it mean one for which operator< is false for either ordering, or does it mean something else?
does it mean one for which operator< is false for either ordering?
Yes, if the set uses the default comparator and compares keys using <. More generally, in an ordered container with comparator Compare, two keys k1 and k2 are regarded as equivalent if !Compare(k1,k2) && !Compare(k2,k1).
Keys are not required to implement operator== or anything else; they are just required to be comparable using the container's comparator to give a strict weak ordering.
std::set has a template argument called 'Compare' as in this signature:
template < class Key, class Compare = less<Key>,
class Allocator = allocator<Key> > class set;
Compare is used to determine the ordering between elements. Here, the default less<Key> uses the < operator to compare two keys.
If it helps, you can think of a set as just a std::map with meaningless values, ie a std::set<int> can be thought of as a std::map<int, int> where the values are meaningless.
The only comparison that set is allowed to perform on T is via the functor type it was given to do comparisons as part of the template. Thus, that's how it defines equivalence.
For every value already in the set, the comparison must evaluate to true for one of the two orderings between that value and the new one. If it's false both ways for any existing value, the new value is treated as a duplicate and won't be stored.
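To illustrate that 'same value' follows the comparator rather than operator==, here is a sketch with a made-up comparator, CaseInsensitiveLess, which is an assumption for this example and not part of the standard library.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Hypothetical comparator: orders strings while ignoring letter case.
struct CaseInsensitiveLess {
    bool operator()(const std::string& a, const std::string& b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

using CaseInsensitiveSet = std::set<std::string, CaseInsensitiveLess>;

// Returns true when `value` was actually inserted, false when an equivalent
// element was already present.
bool try_insert(CaseInsensitiveSet& s, const std::string& value) {
    return s.insert(value).second;
}
```

Inserting "Hello" and then "HELLO" stores only the first string: the two differ under operator==, but neither orders before the other under this comparator.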

When are two elements of an STL set considered identical?

From cplusplus.com:
template < class Key, class Compare = less<Key>,
class Allocator = allocator<Key> > class set;
"Compare: Comparison class: A class that takes two arguments of the same type as the container elements and returns a bool. The expression comp(a,b), where comp is an object of this comparison class and a and b are elements of the container, shall return true if a is to be placed at an earlier position than b in a strict weak ordering operation. This can either be a class implementing a function call operator or a pointer to a function (see constructor for an example). This defaults to less, which returns the same as applying the less-than operator (a<b).
The set object uses this expression to determine the position of the elements in the container. All elements in a set container are ordered following this rule at all times."
Given that the comparison class is used to decide which of the two objects is "smaller" or "less", how does the class check whether two elements are equal (e.g. to prevent insertion of the same element twice)?
I can imagine two approaches here: one would be calling (a == b) in the background, but not providing the option to override this comparison (as with the default less<Key>) doesn't seem too STL-ish to me. The other would be the assumption that (a == b) == (!(a < b) && !(b < a)); that is, two elements are considered equal if neither is "less" than the other, but somehow this doesn't feel right to me either, considering that the comparison can be an arbitrarily complex bool functor between objects of an arbitrarily complex class.
So how is it really done?
Not an exact duplicate, but the first answer here answers your question
Your second guess as to the behaviour is correct
Associative containers in the standard library are defined in terms of equivalence of keys, not equality per se.
As not all set and map instances use less, but may use a generic comparison function, it's necessary to define equivalence in terms of this one comparison function rather than attempting to introduce a separate equality concept.
In general, two keys (k1 and k2) in an associative container using a comparison function comp are equivalent if and only if:
comp( k1, k2 ) == false && comp( k2, k1 ) == false
In a container using std::less for types that don't have a specific std::less specialization, this means the same as:
!(k1 < k2) && !(k2 < k1)
Your mistake is the assumption that "the comparison can be an arbitrarily complex bool functor". It can't.
std::set requires a strict weak ordering, so a<b implies !(b<a). This excludes most binary boolean functors. Because of that, we can talk about the relative position of a and b in that ordering. If a<b, a precedes b. If b<a, b precedes a. If neither a<b nor b<a, then a and b occupy the same position in the ordering and thus are equivalent.
Your second option is the right one. Why doesn't it feel right? What would you do if the equality test wasn't consistent with the equation you give?
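As a rough sketch of why many functors are excluded, the hypothetical checker below tests two of the properties a valid comparator must have, on a sample of values only. Passing it does not prove a functor is a valid ordering (transitivity is not checked), but failing it does rule one out.

```cpp
#include <functional>
#include <vector>

// Checks, over a sample of values, that comp(a, a) is never true and that
// comp(a, b) and comp(b, a) are never both true.
template <class Compare>
bool irreflexive_and_asymmetric(Compare comp, const std::vector<int>& sample) {
    for (int a : sample) {
        if (comp(a, a)) {
            return false;
        }
        for (int b : sample) {
            if (comp(a, b) && comp(b, a)) {
                return false;
            }
        }
    }
    return true;
}
```

std::less<int> and std::greater<int> pass this check on any sample, while std::less_equal<int> fails it because comp(a, a) is true.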