I am coding in a mixed C/C++ environment. I have a struct in the C part and I would like to collect it in a map container in the C++ part.
I think I should define a custom key_compare function object and let map::insert() order the nodes. However, I don't know how to customize map::find(). I am looking for a way to make map::find() do something more than the key_compare function when it checks for equivalence.
Could you please tell me how I can plug these functions into std::map or std::set?
Here is my struct in the C part (compiled with gcc):
typedef struct iotrace_arh_node
{
double time;
unsigned long long int blkno;
int bcount;
u_int flags;
int devno;
unsigned long stack_no;
} iotrace_arh_node_t;
Here are my proposed key_compare and equivalence-checking functions for find() in the C++ part (compiled with g++):
int key_compare ( struct iotrace_arh_node tempa, struct iotrace_arh_node tempb )
{
return (tempa.blkno-tempb.blkno);
}
int key_equal( struct iotrace_arh_node tempa, struct iotrace_arh_node tempb )
{
if( (tempa.blkno == tempb.blkno) && (tempa.bcount == tempb.bcount) )
return 0; // tempa and tempb are equal, node found in the map
else if ( (tempb.blkno < tempa.blkno) )
return -1; // tempb is less than tempa
else if ( (tempb.blkno >= tempa.blkno) && ( tempb.blkno + tempb.bcount < tempa.blkno + tempa.bcount) )
return 0; // tempa and tempb are equal, node found in the map
else
return 1; // tempb is greater than tempa
}
To use the type as the key in a map or set, you need to provide a "less-than" comparison, which takes two arguments and returns true if the first should come before the second. The easiest way to use it in a set is to define it as a function object:
struct key_compare {
    // const, so the comparator can be called on a const container
    bool operator()(const iotrace_arh_node & a, const iotrace_arh_node & b) const {
        return a.blkno < b.blkno;
    }
};
and use it as the "comparator" template argument in the map or set:
typedef std::set<iotrace_arh_node, key_compare> node_set;
If you need different ways of comparing the keys, then you can create different sets with different comparators. However, you can't change the comparator once the set is created; the objects in the set are stored according to the order defined by the comparator, so changing it would make the set unusable. If you need to search the same set by different fields, then have a look at Boost.MultiIndex.
You don't need to provide an equality comparison.
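If the goal behind a customized find() is to locate the stored node whose block range covers a given request, one possible approach is to keep the plain blkno ordering and do the range check after a bounds lookup. The following is only a sketch under assumptions: stored ranges do not overlap, a range is taken as [blkno, blkno + bcount), u_int is replaced by unsigned int, and find_containing and make_node are hypothetical helper names that do not appear in the question.

```cpp
#include <cassert>
#include <set>

// Stand-in for the C struct from the question (u_int replaced by unsigned int).
struct iotrace_arh_node {
    double time;
    unsigned long long blkno;
    int bcount;
    unsigned int flags;
    int devno;
    unsigned long stack_no;
};

// Strict "less-than" on the block number only.
struct key_compare {
    bool operator()(const iotrace_arh_node& a, const iotrace_arh_node& b) const {
        return a.blkno < b.blkno;
    }
};

typedef std::set<iotrace_arh_node, key_compare> node_set;

// Hypothetical helper: returns the stored node whose range contains the
// probe's range, or s.end() if there is none. Assumes non-overlapping ranges.
node_set::const_iterator find_containing(const node_set& s,
                                         const iotrace_arh_node& probe) {
    assert(probe.bcount >= 0);
    node_set::const_iterator it = s.upper_bound(probe); // first blkno > probe.blkno
    if (it == s.begin()) return s.end();                // nothing starts at or before probe
    --it;                                               // last node with blkno <= probe.blkno
    if (probe.blkno + probe.bcount <= it->blkno + it->bcount) return it;
    return s.end();
}

// Hypothetical helper: builds a node with only the key fields filled in.
iotrace_arh_node make_node(unsigned long long blkno, int bcount) {
    iotrace_arh_node n = {0.0, blkno, bcount, 0u, 0, 0ul};
    return n;
}
```

The ordinary find() still treats two nodes with the same blkno as equivalent; the range test lives outside the comparator, so the ordering used for insertion stays a valid "less-than".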
Standard compare functions differ in C and C++. In C, as you have written, you return a negative value, 0 or a positive value when the first argument is less than, equal to or greater than the second one. In C++ you should either overload the < operator, or write a compare function that does the same as the < operator and pass it to the STL functions. You should also make sure that your < is transitive (i.e. a<b && b<c => a<c). This means that your key_compare function should look like:
bool key_compare ( const struct iotrace_arh_node& tempa, const struct iotrace_arh_node& tempb )
{
return (tempa.blkno < tempb.blkno);
}
There is no need to define key_equal, because (k1 == k2) <=> (!(k1<k2)&&!(k2<k1)). And AFAIK you can not use different compare functions when you insert and find.
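The equivalence rule above can be written down directly. This is a small sketch; the helper name equivalent is an assumption chosen for illustration and is not part of the standard library.

```cpp
#include <cassert>
#include <functional>

// Two keys are equivalent under comp when neither orders before the other.
template <typename T, typename Compare>
bool equivalent(const T& a, const T& b, Compare comp) {
    return !comp(a, b) && !comp(b, a);
}

void equivalence_demo() {
    std::less<int> lt;
    assert(equivalent(3, 3, lt));   // neither 3 < 3 nor 3 < 3
    assert(!equivalent(3, 4, lt));  // 3 < 4 holds, so they are not equivalent
}
```

This is the test the ordered containers effectively apply with the comparator they were given, which is why no separate equality function is needed.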
For the comparer, see here: STL Map with custom compare function object
struct my_comparer
{
    bool operator() ( const struct iotrace_arh_node& left, const struct iotrace_arh_node& right ) const
    {
        return left.blkno < right.blkno;
    }
};
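The comparer must be a binary predicate. Because it is passed as a template argument it has to be a type, so a function object like the one above is the simplest choice.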
Then you can use it in a Map:
std::map<Key, Data, Compare, Alloc>
(look here: http://www.cplusplus.com/reference/stl/map/ )
Compare and Alloc have default values.
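As an illustration of the Compare parameter, here is a sketch of the same map declared once with a function object and once with a plain function. The names block_key, by_blkno, blkno_less and the typedefs are made up for this example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical key type, standing in for the struct from the question.
struct block_key {
    unsigned long long blkno;
    int bcount;
};

// Function-object comparator: its type goes straight into the template.
struct by_blkno {
    bool operator()(const block_key& l, const block_key& r) const {
        return l.blkno < r.blkno;
    }
};

// Plain function with "less-than" semantics.
bool blkno_less(const block_key& l, const block_key& r) {
    return l.blkno < r.blkno;
}

typedef std::map<block_key, std::string, by_blkno> functor_map;

// With a plain function, the template argument is the function-pointer type
// and the function itself is handed to the constructor.
typedef bool (*key_less_fn)(const block_key&, const block_key&);
typedef std::map<block_key, std::string, key_less_fn> fnptr_map;

void map_demo() {
    functor_map m1;
    fnptr_map m2(blkno_less);
    block_key k1 = {10, 4};
    block_key k2 = {10, 8};  // same blkno: equivalent to k1 under both comparators
    m1[k1] = "first";
    m1[k2] = "second";       // overwrites the value stored under k1
    m2[k1] = "first";
    m2[k2] = "second";
    assert(m1.size() == 1 && m1[k1] == "second");
    assert(m2.size() == 1 && m2[k1] == "second");
}
```

The function-object form is usually more convenient because the map can default-construct its comparator.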
What is your key-type, btw?
hth
Mario
Related
I want to use a map to count pairs of objects based on member input vectors. If there is a better data structure for this purpose, please tell me.
My program returns a list of int vectors. Each int vector is the output of a comparison between two int vectors ( a pair of int vectors). It is, however, possible, that the output of the comparison differs, though the two int vectors are the same (maybe in different order). I want to store how many different outputs (int vectors) each pair of int vectors has produced.
Assuming that I can access the int vector of my object with .inp()
Two pairs (a1,b1) and (a2,b2) should be considered equal, when (a1.inp() == a2.inp() && b2.inp() == b1.inp()) or (a1.inp() == b2.inp() and b1.inp() == a2.inp()).
This answer says:
The keys in a map a and b are equivalent by definition when neither a
< b nor b < a is true.
class SomeClass
{
    vector <int> m_inputs;
public:
    //constructor, setter...
    vector<int> inp() const { return m_inputs; }
};
typedef pair < SomeClass, SomeClass > InputsPair;
typedef map < InputsPair, size_t, MyPairComparator > InputsPairCounter;
So the question is, how can I define equivalency of two pairs with a map comparator. I tried to concatenate the two vectors of a pair, but that leads to (010,1) == (01,01), which is not what I want.
struct MyPairComparator
{
    bool operator() (const InputsPair & pair1, const InputsPair & pair2) const
    {
        vector<int> itrc1 = pair1.first.inp();
        vector<int> itrc2 = pair1.second.inp();
        vector<int> itrc3 = pair2.first.inp();
        vector<int> itrc4 = pair2.second.inp();
        // ?
        return itrc1 < itrc3;
    }
};
I want to use a map to count pairs of input vectors. If there is a better data structure for this purpose, please tell me.
Using std::unordered_map can be considered instead due to 2 reasons:
if hash implemented properly it could be faster than std::map
you only need to implement hash and operator== instead of operator<, and operator== is trivial in this case
Details on how to implement a hash for std::vector can be found here. In your case, a possible solution is to join both vectors into one, sort it, and then use that method to calculate the hash. This is a straightforward solution, but it can produce too many hash collisions and lead to worse performance. Suggesting a better alternative would require knowledge of the data used.
As I understand, you want:
struct MyPairComparator
{
    bool operator() (const InputsPair& lhs, const InputsPair& rhs) const
    {
        // std::minmax needs <algorithm>; this assumes SomeClass has an operator<
        // (for example one that compares inp()).
        return std::minmax(std::get<0>(lhs), std::get<1>(lhs))
             < std::minmax(std::get<0>(rhs), std::get<1>(rhs));
    }
};
We order each pair {a, b} so that a < b, then use the regular pair comparison.
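To make the idea concrete, here is a self-contained sketch that stores the input vectors directly instead of SomeClass objects, which is a simplification and not the asker's actual types. Inputs, UnorderedPairLess and counter_demo are names chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

typedef std::vector<int> Inputs;
typedef std::pair<Inputs, Inputs> InputsPair;

// Order-insensitive "less-than": compare the (smaller, larger) view of each pair.
struct UnorderedPairLess {
    bool operator()(const InputsPair& lhs, const InputsPair& rhs) const {
        return std::minmax(lhs.first, lhs.second)
             < std::minmax(rhs.first, rhs.second);
    }
};

typedef std::map<InputsPair, std::size_t, UnorderedPairLess> InputsPairCounter;

void counter_demo() {
    InputsPairCounter counts;
    Inputs a = {0, 1, 0};
    Inputs b = {1};
    Inputs c = {0, 1};
    ++counts[InputsPair(a, b)];
    ++counts[InputsPair(b, a)];  // same unordered pair as (a, b)
    ++counts[InputsPair(c, c)];  // (01, 01) stays distinct from (010, 1)
    assert(counts.size() == 2);
    assert(counts[InputsPair(a, b)] == 2);
}
```

Because the two vectors are compared separately rather than concatenated, (010, 1) and (01, 01) end up under different keys.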
I have these structures:
typedef std::pair<unsigned int, std::pair<int, int> > myPair;
typedef std::set< myPair> Graph;
Graph g;
What is the right comparison function for sorting the graph like?
std::sort(g.begin(), g.end(), Cmp());
I tried doing something like this:
struct Cmp{
bool operator()(const myPair& l, const myPair& r)const{
return l.second.second < r.second.second;
}
};
I want the set to be ordered according to the second element of the most inner pair. Is it possible?
a = (std::make_pair(0,std::make_pair(1,1)));
b = (std::make_pair(0,std::make_pair(1,2)));
c = (std::make_pair(0,std::make_pair(1,0)));
d = (std::make_pair(1,std::make_pair(2,0)));
The result would be:
Before the ordering
c = (0,(1,0)), a = (0,(1,1)), b = (0,(1,2)), d = (1,(2,0))
After the ordering
c = (0,(1,0)), d = (1,(2,0)), a = (0,(1,1)), b = (0,(1,2))
Question: Is it possible to create the set in this ordering manner?
Your comparison function is a good start. What is missing is "tie resolution". You need to specify what happens when l.second.second == r.second.second, and also what happens when l.second.first == r.second.first:
bool operator()(const myPair& l, const myPair& r)const{
    return (l.second.second < r.second.second) ||
        ((l.second.second == r.second.second) && (l.second.first < r.second.first)) ||
        ((l.second.second == r.second.second) && (l.second.first == r.second.first) && (l.first < r.first));
}
The first condition comes from your implementation.
The second condition tells what to do when the first-priority items are equal to each other.
The third condition tells what to do when both the first-priority and the second-priority items are equal to each other.
In order to use this function for ordering your set, you need to pass it as the second template parameter to std::set. Here is a Q&A explaining how to do it.
You cannot call std::sort on a std::set, but you can construct the set with a predicate.
typedef std::set<myPair, Cmp> Graph;
You'd need another set with a different predicate. You can't change the sorting algorithm of an existing set.
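Putting the pieces together, a set declared with the comparator iterates in the order asked for in the question. This sketch writes the tie-breaking with std::tie (C++11), which is a swapped-in shorthand for the chained conditions shown earlier; ordered_example is a name made up for the demonstration.

```cpp
#include <cassert>
#include <set>
#include <tuple>
#include <utility>
#include <vector>

typedef std::pair<unsigned int, std::pair<int, int> > myPair;

// Priority: innermost second element, then innermost first, then the outer first.
struct Cmp {
    bool operator()(const myPair& l, const myPair& r) const {
        return std::tie(l.second.second, l.second.first, l.first)
             < std::tie(r.second.second, r.second.first, r.first);
    }
};

typedef std::set<myPair, Cmp> Graph;

// Inserts a, b, c, d from the question and returns them in iteration order.
std::vector<myPair> ordered_example() {
    Graph g;
    g.insert(std::make_pair(0u, std::make_pair(1, 1))); // a
    g.insert(std::make_pair(0u, std::make_pair(1, 2))); // b
    g.insert(std::make_pair(0u, std::make_pair(1, 0))); // c
    g.insert(std::make_pair(1u, std::make_pair(2, 0))); // d
    assert(g.size() == 4);
    return std::vector<myPair>(g.begin(), g.end());
}
```

With these four elements the iteration order is c, d, a, b, matching the "After the ordering" line in the question.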
I have the following C++ code
#include <set>
#include <string>
#include <iostream>
using namespace std;
class Pair {
public:
string lhs;
string rhs;
Pair();
Pair( string l, string r ) {
lhs=l;
rhs=r;
};
};
struct compare {
bool operator()(const Pair& a, const Pair& b) const{
if ( ( a.lhs == b.lhs && a.rhs == b.rhs ) || ( a.lhs == b.rhs && a.rhs == b.lhs ) ) {
cout << "MATCH" << endl;
}
return ( a.lhs == b.lhs && a.rhs == b.rhs ) || ( a.lhs == b.rhs && a.rhs == b.lhs );
}
};
int main () {
set<Pair, compare > s;
Pair p( string("Hello"), string("World") );
s.insert(p);
cout << s.size() << "\n";
Pair q( string("World"), string("Hello") );
s.insert(q);
cout << s.size() << "\n";
compare cmp;
cout << cmp( p, q );
return 0;
}
Invoking the compiled code gives:
1
MATCH
MATCH
2
MATCH
Somehow the set s ends up with both Pairs p, and q in spite of the fact that the comparator identifies them as identical.
Why?
Any help will be much appreciated!
UPDATE:
Many thanks for the great answers and your kind and professional help.
As you might have guessed already, I am quite a newbie to C++.
Anyway, I was wondering, if Antoine's answer could be done with a lambda expression?
Something like:
std::set< …, [](){ my_comparator_code_here } > s;
????
The comparison operator for a std::set (which is an ordered container) needs to identify a strict weak ordering not any arbitrary test you wish. Normally a properly implemented operator< does the job.
If your comparison operator does not provide a strict weak ordering (as yours does not), the behavior will be undefined. There is no way to work around this requirement of the C++ standard.
Note that in certain cases where an equality comparison is needed it will have to use the operator< twice to make the comparison.
Also have you considered using std::pair<std::string, std::string> instead of rolling your own?
I've reread your question about five times now and I'm starting to wonder if what you want is a set of pairs where it doesn't matter for the comparison which string is first and which is second. In that case @Antoine has what appears to be the correct solution for you.
A comparator for a set, a map, or any algorithm such as lower_bound or sort that requires an order needs to implement a strict weak ordering (basically, behave like a <).
Such an ordering is required to have 3 properties:
irreflexive: not (a < a) is always true
asymmetric: a < b implies not (b < a)
transitive: a < b and b < c imply a < c
Which, you will note, < has.
Such an ordering defines equivalence classes, which are groups of elements that compare equal according to the ordering (that is not (a < b) and not (b < a) is verified). In a set or map, only a single element per equivalence class can be inserted whereas a multiset or multimap may hold multiple elements per equivalence class.
Now, if you look at your comparator, you will realize that you have implemented == which does not define any order at all. You need to implement something akin to < instead.
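The three properties listed above can be checked by brute force on a small sample, which makes it easy to see why an equality test fails as a comparator. This is a sketch: looks_like_strict_order is a made-up helper, it only covers the three properties named here, and passing on a finite sample proves nothing about other values.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Checks irreflexivity, asymmetry and transitivity of comp on the sample xs.
template <typename T, typename Compare>
bool looks_like_strict_order(const std::vector<T>& xs, Compare comp) {
    for (std::size_t i = 0; i < xs.size(); ++i) {
        if (comp(xs[i], xs[i])) return false;  // violates irreflexivity
        for (std::size_t j = 0; j < xs.size(); ++j) {
            if (comp(xs[i], xs[j]) && comp(xs[j], xs[i]))
                return false;                  // violates asymmetry
            for (std::size_t k = 0; k < xs.size(); ++k) {
                if (comp(xs[i], xs[j]) && comp(xs[j], xs[k]) && !comp(xs[i], xs[k]))
                    return false;              // violates transitivity
            }
        }
    }
    return true;
}

void order_check_demo() {
    std::vector<int> xs = {1, 2, 3};
    assert(looks_like_strict_order(xs, std::less<int>()));
    assert(!looks_like_strict_order(xs, std::equal_to<int>()));   // a == a is true
    assert(!looks_like_strict_order(xs, std::less_equal<int>())); // a <= a is true
}
```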
A simple but extremely effective trick is to use tuples, which have < (and == and every other comparison operator) already implemented in lexicographical order. Thus, std::tuple<std::string, std::string> has exactly the order you wish; and even better, std::tuple<std::string const&, std::string const&> also has it, and can be constructed very easily using std::tie.
Therefore, the implementation of a straightforward comparator is as simple as:
struct comparator {
    bool operator()(Pair const& left, Pair const& right) const {
        // member names as in the question's Pair class
        return std::tie( left.lhs,  left.rhs)
             < std::tie(right.lhs, right.rhs);
    }
};
Note: although not discussed much, it is absolutely essential that the ordering of the comparator be stable across calls. As such, it should generally only depend on the values of the elements, and nothing external or runtime-related (such as their addresses in memory)
EDIT: as noted, your comparator is slightly more complicated.
In your case, though, you also need to take into account that the two strings of a pair play symmetric roles. In general, I would suggest uniquifying the representation in the constructor of the object; if that is not possible, you can uniquify first and compare second:
struct comparator {
    bool operator()(Pair const& left, Pair const& right) const {
        // Needs <tuple> for std::tie and <cassert> for assert;
        // member names as in the question's Pair class.
        auto uleft = left.lhs < left.rhs ? std::tie(left.lhs, left.rhs)
                                         : std::tie(left.rhs, left.lhs);
        auto uright = right.lhs < right.rhs ? std::tie(right.lhs, right.rhs)
                                            : std::tie(right.rhs, right.lhs);
        assert(std::get<0>(uleft) <= std::get<1>(uleft) and "Incorrect uleft");
        assert(std::get<0>(uright) <= std::get<1>(uright) and "Incorrect uright");
        return uleft < uright;
    }
}; // struct comparator
As Mark B said, compare represents an ordering and not an equality; by default it is std::less. In your case, you don't want the comparison to depend on the order within your pair, but at the same time your operator< must satisfy a number of conditions.
All the answers here propose to change your specification and make the comparison order-dependent. But if you don't want that, here is the solution:
bool operator()(const Pair & a, const Pair & b) const {
    // aInOrder is true when a.lhs already sorts before a.rhs
    const bool aInOrder = a.lhs < a.rhs;
    const std::string & al = aInOrder ? a.lhs : a.rhs;
    const std::string & ar = aInOrder ? a.rhs : a.lhs;
    const bool bInOrder = b.lhs < b.rhs;
    const std::string & bl = bInOrder ? b.lhs : b.rhs;
    const std::string & br = bInOrder ? b.rhs : b.lhs;
    return al < bl || (al == bl && ar < br);
}
At least, it works on your example, and the relation is irreflexive and transitive.
Here is how it works: it is the lexicographic order for pairs: al < bl || (al == bl && ar < br), applied to sorted pairs.
In fact your data structure is a (set of size N) of (set of size 2). Internally, std::set sorts its elements using your comparison operators. For your "set of size 2" Pair you also need to consider them as internally sorted.
If the comparison code looks too heavy, you could move the pair sorting into the Pair class, for example by implementing two methods min() and max(). If you also implement operator<, you no longer need a compare class:
struct Pair {
string lhs, rhs;
Pair();
Pair( string l, string r ) : lhs(l), rhs(r) {}
const std::string & min() const { return lhs < rhs ? lhs : rhs; }
const std::string & max() const { return lhs < rhs ? rhs : lhs; }
bool operator<(const Pair& b) const {
return min() < b.min() || (min() == b.min() && max() < b.max());
}
};
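As for the lambda variant asked about in the question's update: a lambda cannot be written directly inside the template argument list before C++20, but its type can be obtained with decltype and the lambda object passed to the set's constructor. The following is a sketch with a simplified Pair and an order-insensitive comparison built on std::minmax; lambda_set_demo is a name made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// Simplified version of the question's Pair class.
struct Pair {
    std::string lhs, rhs;
    Pair(const std::string& l, const std::string& r) : lhs(l), rhs(r) {}
};

std::size_t lambda_set_demo() {
    // Order-insensitive "less-than": compare the (smaller, larger) view of each pair.
    auto cmp = [](const Pair& a, const Pair& b) {
        return std::minmax(a.lhs, a.rhs) < std::minmax(b.lhs, b.rhs);
    };
    // The lambda's type is the template argument; the lambda itself goes to the constructor.
    std::set<Pair, decltype(cmp)> s(cmp);
    s.insert(Pair("Hello", "World"));
    s.insert(Pair("World", "Hello"));  // equivalent to the first pair, not inserted
    s.insert(Pair("Hello", "There"));
    assert(s.size() == 2);
    return s.size();
}
```

A named function object remains the more portable choice when the set type has to appear in a header or as a class member.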
from here
The set object uses this expression to determine both the order the elements follow in the container and whether two element keys are equivalent (by comparing them reflexively: they are equivalent if !comp(a,b) && !comp(b,a)). No two elements in a set container can be equivalent.
Sorry all, I jumped the gun because I disliked another answer. I will expand and correct momentarily. As pointed out, an order needs to be implemented; typically this would be a lexicographical order. Importantly, however, you still need to make sure that the case in which you consider two pairs to be equal returns false in both directions.
if (( a.lhs == b.lhs && a.rhs == b.rhs ) || ( a.lhs == b.rhs && a.rhs == b.lhs )) return false;
//ordinary lexicographical compare
if( a.lhs < b.lhs) return true;
else if( a.lhs == b.lhs && a.rhs < b.rhs) return true;
else return false;
Notice the "!", simple. Your code is saying that pair one is less than pair two, which is less than pair one. You want it to say that neither is less than the other.
DISCLAIMER STILL WRONG ON A TECHNICALITY, ANTOINE'S IS THE CORRECT ONE
I'm trying to sort a concurrent_vector type, where hits_object is:
struct hits_object{
unsigned long int hash;
int position;
};
Here is the code I'm using:
concurrent_vector<hits_object*> hits;
for(i=0;...){
hits_object *obj=(hits_object*)malloc(sizeof(hits_object));
obj->position=i;
obj->hash=_prevHash[tid];
hits[i]=obj;
}
Now I have filled up a concurrent_vector<hits_object*> called hits.
But I want to sort this concurrent_vector on position property!!!
Here is an example of what's inside a typical hits object:
0 1106579628979812621
4237 1978650773053442200
512 3993899825106178560
4749 739461489314544830
1024 1629056397321528633
5261 593672691728388007
1536 5320457688954994196
5773 9017584181485751685
2048 4321435111178287982
6285 7119721556722067586
2560 7464213275487369093
6797 5363778283295017380
3072 255404511111217936
7309 5944699400741478979
3584 1069999863423687408
7821 3050974832468442286
4096 5230358938835592022
8333 5235649807131532071
I want to sort this based on the first column ("position" of type int). The second column is "hash" of type unsigned long int.
Now I've tried to do the following:
std::sort(hits.begin(),hits.end(),compareByPosition);
where compareByPosition is defined as:
int compareByPosition(const void *elem1,const void *elem2 )
{
return ((hits_object*)elem1)->position > ((hits_object*)elem2)->position? 1 : -1;
}
but I keep getting segmentation faults when I put in the line std::sort(hits.begin(),hits.end(),compareByPosition);
Please help!
Your compare function needs to return a boolean 0 or 1, not an integer 1 or -1, and it should have a strongly-typed signature:
bool compareByPosition(const hits_object *elem1, const hits_object *elem2 )
{
return elem1->position < elem2->position;
}
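The errors you were seeing are due to std::sort interpreting every non-zero value returned from the comparison function as true, meaning that the left-hand side is less than the right-hand side. Since your function returns either 1 or -1, it always reports "less than", which is not a valid ordering.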
NOTE : This answer has been heavily edited as the result of conversations with sbi and Mike Seymour.
int (*)(const void*, const void*) is the comparator signature for the C qsort() function. In C++, the comparator for std::sort() looks like this:
bool cmp(const hits_object* lhs, const hits_object* rhs)
{
return lhs->position < rhs->position;
}
std::sort(hits.begin(), hits.end(), &cmp);
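On the other hand, you can use std::pair, whose default operator< compares the first members and, if they are equal, the second ones (this requires storing the pairs by value rather than by pointer):
typedef std::pair<int, unsigned long int> hits_object; // first = position, second = hash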
// ...
std::sort(hits.begin(), hits.end());
Without knowing what concurrent_vector is, I can't be sure what's causing the segmentation fault. Assuming it's similar to std::vector, you need to populate it with hits.push_back(obj) rather than hits[i] = obj; you cannot use [] to access elements beyond the end of a vector, or to access an empty vector at all.
The comparison function should be equivalent to a < b, returning a boolean value; it's not a C-style comparison function returning negative, positive, or zero. Also, since sort is a template, there's no need for C-style void * arguments; everything is strongly typed:
bool compareByPosition(hits_object const * elem1, hits_object const * elem2) {
return elem1->position < elem2->position;
}
Also, you usually don't want to use new (and certainly never malloc) to create objects to store in a vector; the simplest and safest container would be vector<hits_object> (and a comparator that takes references, rather than pointers, as arguments). If you really must store pointers (because the objects are expensive to copy and not movable, or because you need polymorphism - neither of which apply to your example), either use smart pointers such as std::unique_ptr, or make sure you delete them once you're done with them.
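A sketch of the by-value variant described above follows. It assumes std::vector as a stand-in for concurrent_vector, uses a few positions from the question's sample data, and fills in made-up hash values; sorted_hits_demo is a name chosen for the demonstration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct hits_object {
    unsigned long int hash;
    int position;
};

// "Less-than" on position, taking references because the vector stores objects.
bool compareByPosition(const hits_object& a, const hits_object& b) {
    return a.position < b.position;
}

std::vector<hits_object> sorted_hits_demo() {
    std::vector<hits_object> hits;  // std::vector as a stand-in for concurrent_vector
    const int positions[] = {0, 4237, 512, 4749, 1024};
    for (int i = 0; i < 5; ++i) {
        hits_object obj;
        obj.position = positions[i];
        obj.hash = 1000ul + i;      // made-up hash values for the example
        hits.push_back(obj);        // grows the vector; no malloc/new needed
    }
    std::sort(hits.begin(), hits.end(), compareByPosition);
    assert(std::is_sorted(hits.begin(), hits.end(), compareByPosition));
    return hits;
}
```

No cleanup is required here because the vector owns its elements directly.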
The third argument you pass to std::sort() must have a signature similar to, and the semantics of, operator<():
bool is_smaller_position(const hits_object* lhs, const hits_object* rhs)
{
return lhs->position < rhs->position;
}
When you store pointers in a vector, you cannot overload operator<(), because smaller-than is fixed for all built-in types.
On a sidenote: Do not use malloc() in C++, use new instead. Also, I wonder why you are not using objects, rather than pointers. Finally, if concurrent_vector is anything like std::vector, you need to explicitly make it expand to accommodate new objects. This is what your code would then look like:
concurrent_vector<hits_object> hits; // stores objects, not pointers
for(i=0;...){
    hits_object obj;
    obj.position=i;
    obj.hash=_prevHash[tid];
    hits.push_back(obj);
}
This doesn't look right:
for(i=0;...){
hits_object *obj=(hits_object*)malloc(sizeof(hits_object));
obj->position=i;
obj->hash=_prevHash[tid];
hits[i]=obj;
}
Here you are already sorting the array by 'i', because you set position to i and i is also the index into hits!
Also, why use malloc? You should use new (and delete) instead. You could then create a simple constructor for the structure to initialize the hits_object,
e.g.
struct hits_object
{
    int position;
    unsigned long int hash; // same type as in the question
    hits_object( int p, unsigned long int h ) : position(p), hash(h) {}
};
then later write instead
hits_object* obj = new hits_object( i, _prevHash[tid] );
or even
hits.push_back( new hits_object( i, _prevHash[tid] ) );
Finally, your compare function should use the same data type as vector for its arguments
bool cmp( hits_object* p1, hits_object* p2 )
{
return p1->position < p2->position;
}
You can pass a lambda instead of a function to std::sort.
struct test
{
    int x;
};
std::vector<test> tests;
// The vector stores objects, so the lambda takes references, not pointers.
std::sort(tests.begin(), tests.end(),
    [](const test& a, const test& b)
    {
        return a.x < b.x;
    });
I want to map objects of a given class to objects of another. The class I want to use as key, however, was not written by me and is a simple struct with a few values. std::map orders its contents, and I was wondering how it does it, and if any arbitrary class can be used as a key or if there's a set of requirements (operators and what not) that need to be defined.
If so, I could create a wrapper for the class implementing the operators map uses. I just need to know what I need to implement first, and none of the references for the class I found online specify them.
All that is required of the key is that it be copiable and assignable.
The ordering within the map is defined by the third argument to the
template (and the argument to the constructor, if used). This
defaults to std::less<KeyType>, which defaults to the < operator,
but there's no requirement to use the defaults. Just write a comparison
operator (preferably as a functional object):
struct CmpMyType
{
bool operator()( MyType const& lhs, MyType const& rhs ) const
{
// ...
}
};
Note that it must define a strict ordering, i.e. if CmpMyType()( a, b
) returns true, then CmpMyType()( b, a ) must return false, and if
both return false, the elements are considered equal (members of the
same equivalence class).
You need to define the operator<, for example like this :
struct A
{
int a;
std::string b;
};
// Simple but wrong as it does not provide the strict weak ordering.
// As A(5,"a") and A(5,"b") would be considered equal using this function.
bool operator<(const A& l, const A& r )
{
return ( l.a < r.a ) && ( l.b < r.b );
}
// Better brute force.
bool operator<(const A& l, const A& r )
{
if ( l.a < r.a ) return true;
if ( l.a > r.a ) return false;
// a are equal, compare b
return ( l.b < r.b );
}
// This can often be seen written as
bool operator<(const A& l, const A& r )
{
// This is fine for a small number of members.
// But I prefer the brute force approach when you start to get lots of members.
return ( l.a < r.a ) ||
(( l.a == r.a) && ( l.b < r.b ));
}
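When the number of members grows, std::tie (C++11) offers a compact way to get the same lexicographic comparison. The sketch below reuses the struct A from above as a map key; map_key_demo is a hypothetical helper added only for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

struct A {
    int a;
    std::string b;
};

// Lexicographic "less-than": compare a first, then b when the a values are equal.
bool operator<(const A& l, const A& r) {
    return std::tie(l.a, l.b) < std::tie(r.a, r.b);
}

void map_key_demo() {
    std::map<A, int> m;  // std::less<A> picks up the operator< above
    A k1 = {5, "a"};
    A k2 = {5, "b"};
    m[k1] = 1;
    m[k2] = 2;           // differs only in b: a distinct key
    m[k1] = 3;           // same key as the first insert: overwrites the value
    assert(m.size() == 2);
    assert(m[k1] == 3);
}
```

Adding a member to the ordering then only means adding it to both std::tie calls, in the same position.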
The answer is actually in the reference you link, under the description of the "Compare" template argument.
The only requirement is that Compare (which defaults to less<Key>, which defaults to using operator< to compare keys) must be a "strict weak ordering".
Same as for set: The class must have a strict ordering in the spirit of "less than". Either overload an appropriate operator<, or provide a custom predicate. Any two objects a and b for which !(a<b) && !(b<a) will be considered equal.
The map container will actually keep all the elements in the order provided by that ordering, which is how you can achieve O(log n) lookup and insertion time by key value.