C++ std::copy from std::deque to std::set - c++

I have a class with an "array of arrays" private member, represented as:
std::deque<std::deque<SomeClass> > someArray_;
This class also has a public method that returns all unique SomeClass instances contained in someArray_. Two SomeClass instances count as unique if they differ in at least one of several class members. I decided to use std::set for that purpose. The method has the following prototype:
std::set<SomeClass> getAllUniqueInstances() const;
In the implementation of this method I use the following construct to populate the std::set:
std::set<SomeClass> allUniqueInstances;
for(auto it = std::begin(someArray_); it != std::end(someArray_); ++it){
std::copy((*it).begin(),
(*it).end(),
std::inserter(allUniqueInstances, allUniqueInstances.end()));
}
operator<() is defined for SomeClass. As a result my std::set is populated, but a large number of instances are missing. Modifying operator<() for SomeClass changes the result, but breaks the desired sort order. How does std::copy determine whether a given instance is unique in this case?
UPD: source code for SomeClass
class SomeClass{
private:
uint32_t from_;
uint32_t to_;
double capacity_;
double flow_;
public:
...
bool operator<(const SomeClass& rhs) const;
...
};
I want SomeClass instances to be ordered in set by from_ member:
bool SomeClass::operator<( const SomeClass& rhs ) const{
if(this->from_ < rhs.from_)
return true;
return false;
}

It is not std::copy that decides whether instances are unique, but std::set. The set treats two elements A and B as duplicates when
(A < B is false) and (B < A is false)
So the criterion that defines the ordering also defines the "uniqueness". It seems that either std::set is the wrong data structure for this problem, or your ordering criterion is incorrect (it does not implement a strict weak ordering) or too coarse for the problem (you order on a small number of attributes when you could use more).
Here is an example of a lexicographical comparison using more attributes than you currently have:
#include <tuple> // for std::tie
bool SomeClass::operator<( const SomeClass& rhs ) const
{
return std::tie(from_, to_, capacity_, flow_) < std::tie(rhs.from_, rhs.to_, rhs.capacity_, rhs.flow_);
}
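To see how the comparison alone decides what std::set drops as a duplicate, here is a minimal sketch. The Edge struct and the comparator names ByFromOnly and ByAllFields are hypothetical stand-ins for SomeClass (public members, only two fields), not the asker's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <tuple>
#include <vector>

// Hypothetical stand-in for SomeClass, with public members for brevity.
struct Edge {
    std::uint32_t from;
    std::uint32_t to;
};

// Orders by `from` only: two edges with the same `from` are equivalent.
struct ByFromOnly {
    bool operator()(const Edge& a, const Edge& b) const { return a.from < b.from; }
};

// Lexicographical order over both members: only identical edges are equivalent.
struct ByAllFields {
    bool operator()(const Edge& a, const Edge& b) const {
        return std::tie(a.from, a.to) < std::tie(b.from, b.to);
    }
};

// Copies the edges into a set ordered by Compare and reports how many survive.
template <typename Compare>
std::size_t countUnique(const std::vector<Edge>& edges) {
    const std::set<Edge, Compare> unique(edges.begin(), edges.end());
    return unique.size();
}

void demoUniqueness() {
    const std::vector<Edge> edges = {Edge{1, 2}, Edge{1, 3}, Edge{2, 3}, Edge{1, 2}};
    assert(countUnique<ByFromOnly>(edges) == 2);   // {1,3} is dropped along with the repeated {1,2}
    assert(countUnique<ByAllFields>(edges) == 3);  // only the repeated {1,2} is dropped
}
```

Both sets still iterate in ascending order of from, so comparing more members does not have to break the desired primary sort order.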

Related

the Comparison Functions in C++

I am currently learning STL in C++. And I was looking at references on a program I'm coding. It is using priority queue with a custom object.
struct Foo
{
std::list<int> path;
int cost;
bool operator>(const Foo& rhs) const
{
return cost > rhs.cost;
}
};
class mycomparison
{
public:
bool operator() (Foo p1, Foo p2) const{
return (p1>p2);
}
};
priority_queue<Foo,vector<Foo>,mycomparison> myPQ;
The objects in the priority queue are prioritized so that those with lower cost come first. I know that you can define custom comparators, but I'm not sure why there is both an overloaded operator in the struct and a custom one in class mycomparison, which is used in the priority queue. If I remove the overloaded operator from the struct, it refuses to compile.
If someone would please explain to me the use of both code, the relations, how it affects one another, it would be much appreciated!
Thank you.
std::priority_queue uses std::less<T> as the default comparator.
If and when that is not appropriate, as in your case, you have to define a custom comparator or use another comparator that is appropriate for your need.
The implementation details of the operator() function of the comparator are entirely up to you.
Do you need the operator> function in Foo? Certainly not. You could have used:
class mycomparison
{
public:
bool operator() (const Foo& p1, const Foo& p2) const{ // by reference: avoids copying the std::list
return (p1.cost > p2.cost);
}
};
That could have obviated the need to implement Foo::operator>.
However, using return (p1 > p2) keeps the abstractions in the right place. The details of what p1 > p2 means are best left up to Foo.
BTW, you could have used:
std::priority_queue<Foo, std::vector<Foo>, std::greater<Foo>> myPQ;
That would have made mycomparison unnecessary.
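As a rough sketch of that last variant, the following pushes a few costs and records the order in which they are popped. Foo is simplified here (the path member is left out) and popOrder is a hypothetical helper added only for illustration.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Simplified Foo: only the cost member from the question is kept.
struct Foo {
    int cost;
    // std::greater<Foo> calls this operator.
    bool operator>(const Foo& rhs) const { return cost > rhs.cost; }
};

// Pushes the given costs and returns them in the order the queue pops them.
std::vector<int> popOrder(const std::vector<int>& costs) {
    std::priority_queue<Foo, std::vector<Foo>, std::greater<Foo>> pq;
    for (int c : costs) pq.push(Foo{c});
    std::vector<int> order;
    while (!pq.empty()) {
        order.push_back(pq.top().cost);
        pq.pop();
    }
    return order;
}

void demoPopOrder() {
    // With std::greater the smallest cost sits on top, so costs come out ascending.
    assert((popOrder({5, 1, 3}) == std::vector<int>{1, 3, 5}));
}
```

With the default std::less<Foo> the queue would instead need Foo::operator< and would pop the highest cost first.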

How to use insert in the set in c++ for user defined data type?

I would like to use a set<vector<data>> where data is a user-defined class and both the set and the vector are STL,
class data
{
int info;
};
I don't understand whether we need to define a comparison operator for both vector<data> and the data class, or only for the data class.
And how do we define that comparison operator?
std::vector already has an ordering - lexicographical order - so you normally don't need to do anything with that.
If you rely on the default vector ordering, you always need to define an ordering for your own element class (the example below shows a case where you don't), and the most common way is to overload operator<.
Note that the ordering relation must be a strict weak ordering; otherwise the behavior of the set is undefined.
If you want a special sense of "equality" for the set, you need to define your own comparator.
For example, this code would make a set where vectors of equal length are considered equal (so only the first one encountered of each length is added to the set):
template<typename T>
struct shorter_vector
{
bool operator() (const std::vector<T>& left, const std::vector<T>& right) const
{
return left.size() < right.size();
}
};
// ...
struct A { int x; };
std::set<std::vector<A>, shorter_vector<A>> samelengths;
samelengths.insert({A{1}});
samelengths.insert({A{2}});
samelengths.insert({A{3},A{4}});
samelengths.insert({A{5},A{67}});
// set now contains {A{1}} and {A{3},A{4}}
Note that this set doesn't need an ordering for the vector's elements, since the equivalence relation is defined on structure alone.
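For the more common case where the default lexicographical vector ordering is what you want, a minimal sketch could look like this. Data stands in for the asker's data class (with a public member for brevity) and countDistinct is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Stand-in for the user-defined class from the question.
struct Data {
    int info;
};

// The only ordering we have to supply: how two Data objects compare.
bool operator<(const Data& a, const Data& b) { return a.info < b.info; }

// Puts the vectors into a set with the default ordering and counts the survivors.
std::size_t countDistinct(const std::vector<std::vector<Data>>& vectors) {
    const std::set<std::vector<Data>> s(vectors.begin(), vectors.end());
    return s.size();
}

void demoVectorSet() {
    const std::vector<Data> a = {Data{1}, Data{2}};
    const std::vector<Data> sameAsA = {Data{1}, Data{2}};
    const std::vector<Data> b = {Data{1}, Data{3}};
    // a and sameAsA are equivalent element by element, so only one is kept.
    assert(countDistinct({a, sameAsA, b}) == 2);
}
```

No comparator is written for the vector itself; std::vector's own operator< compares element by element using the operator< defined for Data.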

unordered_multimap usage and operator overwriting

I need to use an unordered_multimap for my Note objects and the keys will be the measureNumber member of my objects. I'm trying to implement it as shown here but I'm stuck.
First off, I don't understand why I have to overwrite the operator== before I can use it. I'm also confused about why I need a hash and how to implement it. In this example here, none of those two things is done.
So based on the first example, this is what I have:
class Note {
private:
int measureNumber;
public:
inline bool operator== (const Note &noteOne, const Note &noteTwo);
}
inline bool Note::operator ==(const Note& noteOne, const Note& noteTwo){
return noteOne.measureNumber == noteTwo.measureNumber;
}
I don't know how to implement the hash part though. Any ideas?
std::multimap is based on a sorted binary tree, which uses a less-than operation to sort the nodes.
std::unordered_multimap is based on a hash table, which uses hash and equality operations to organize the nodes without sorting them.
The sorting or hashing is based on the key values. If the objects are the keys, then you need to define these operations. If the keys are of predefined type like int or string, then you don't need to worry about it.
The problem with your pseudocode is that measureNumber is private, so the user of Note cannot easily specify the key to the map. I would recommend making measureNumber public or rethinking the design. (Is measure number really a good key value? I'm guessing this is musical notation.)
std::multimap< int, Note > notes;
Note myNote( e_sharp, /* octave */ 3, /* measure */ 5 );
notes.insert( std::make_pair( myNote.measureNumber, myNote ) );
The objects can be keys and values at the same time, if you use std::multiset or std::unordered_multiset, in which case you would want to define the operator overload (and possibly hash). If operator== (or operator<) is a member function, then the left-hand side becomes this and the right-hand side becomes the sole argument. Usually these functions should be non-member friends. So then you would have
class Note {
private:
int measureNumber;
public:
friend bool operator< (const Note &noteOne, const Note &noteTwo);
};
inline bool operator <(const Note& noteOne, const Note& noteTwo){
return noteOne.measureNumber < noteTwo.measureNumber;
}
This class could be used with std::multiset. To perform a basic lookup, you can construct a dummy object whose members are uninitialized except for measureNumber; this only works for simple object types.
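A sketch of that dummy-object lookup, assuming a hypothetical two-argument constructor and an extra pitch member that the original Note does not show:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>

class Note {
public:
    // Hypothetical constructor; the question does not show one.
    Note(int measureNumber, int pitch) : measureNumber_(measureNumber), pitch_(pitch) {}
    int pitch() const { return pitch_; }
    friend bool operator<(const Note& a, const Note& b) {
        return a.measureNumber_ < b.measureNumber_;
    }
private:
    int measureNumber_;
    int pitch_;  // hypothetical extra member, ignored by the ordering
};

// Counts the notes in the given measure by probing with a dummy Note.
std::size_t notesInMeasure(const std::multiset<Note>& notes, int measure) {
    const Note probe(measure, 0);  // only measureNumber_ matters for the lookup
    const auto range = notes.equal_range(probe);
    return static_cast<std::size_t>(std::distance(range.first, range.second));
}

void demoMultiset() {
    const std::multiset<Note> notes = {Note(5, 60), Note(5, 64), Note(7, 67)};
    assert(notesInMeasure(notes, 5) == 2);
    assert(notesInMeasure(notes, 6) == 0);
}
```

The probe's pitch value is irrelevant because operator< never looks at it.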
I need to use an unordered_multimap for my Note objects and the keys
will be the measureNumber member of my objects.
OK - I'm not sure whether you're after a multiset, unordered_multiset, multimap, or unordered_multimap. I know your title refers to unordered_multimap, but the link you provided leads to unordered_multiset. There are a multitude of considerations which should be taken into account when choosing a container, but second-guessing which will be the best-performing without profiling is a risky business.
I don't understand why I have to overwrite the operator== before I can use it.
I'm also confused about why I need a hash and how to implement it.
In this example here, none of those two things is done.
You need the operator== and std::hash as they're used internally by unordered_multimap and unordered_multiset. In the example you linked to, the key is of type int, so operator== and std::hash<int> are already defined. If you choose to use Note as a key, you have to define these yourself.
I'd recommend starting with a multiset if you don't need to change the elements frequently. If you do want to be able to change Notes without erasing and inserting, I'd recommend removing measureNumber as a member of Note and using a multimap<int, Note>.
If you feel an unordered_ version of your container would better suit your needs, you still have the set vs map choice. If you choose unordered_multimap<int, Note> (having removed measureNumber from Note), then as in your linked example, the key is int. So you won't have to define anything special for this to work. If you choose to keep measureNumber as a member of Note and use unordered_multiset<Note>, then Note is the key and so you need to do further work, e.g.
#include <functional>
#include <unordered_set>
class Note; // Forward declaration to allow specialisation of std::hash<>
namespace std {
template<>
class hash<Note> {
public:
size_t operator()(const Note &) const; // declaration of operator() to
// allow befriending by Note
};
}
class Note {
private:
int measureNumber;
public:
// functions befriended to allow access to measureNumber
friend bool operator== (const Note &, const Note &);
friend std::size_t std::hash<Note>::operator()(const Note &) const;
};
inline bool operator== (const Note &noteOne, const Note &noteTwo) {
return noteOne.measureNumber == noteTwo.measureNumber;
}
std::size_t std::hash<Note>::operator()(const Note &note) const {
return std::hash<int>()(note.measureNumber);
}
This lets you create and use std::unordered_multiset<Note>. However, I'm not sure this is really what you need; you could even find that a sorted std::vector<Note> is best for you. Further research and thought as to how you'll use your container along with profiling should give the best answer.
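For comparison, here is a minimal sketch of the simpler unordered_multimap<int, Note> option, where the key is a plain int and nothing extra has to be defined. The Note shown here (a single pitch string) and the NotesByMeasure alias are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical Note without measureNumber; the measure lives in the map key.
struct Note {
    std::string pitch;
};

using NotesByMeasure = std::unordered_multimap<int, Note>;

// int keys already have std::hash<int> and operator==, so no custom code is needed.
std::size_t countInMeasure(const NotesByMeasure& notes, int measure) {
    return notes.count(measure);
}

void demoUnorderedMultimap() {
    NotesByMeasure notes;
    notes.emplace(5, Note{"E#3"});
    notes.emplace(5, Note{"G3"});
    notes.emplace(7, Note{"C4"});
    assert(countInMeasure(notes, 5) == 2);
    assert(countInMeasure(notes, 6) == 0);
}
```

equal_range(measure) would give access to the notes themselves rather than just their count.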

How to make set::find() work for custom class objects?

I'm a bit confused about using STL set::find() for a set of my own defined class objects.
My class contains more than two members (3, 4, 5, etc.), so how can I overload the less-than operator?
I tried it for 3 members as follows, and it works fine:
return( (a1.i < a2.i) ||
(!(a1.i > a2.i) && (a1.f < a2.f)) ||
(!(a1.i > a2.i) && !(a1.f > a2.f) && (a1.c < a2.c)));
where, a1, and a2 are class objects and (i, f and c are class members).
Now I want to generalize this for n members, but my find() does not always work.
I've been looking through STL's detailed documentation, trying to learn how set::find() is implemented, and why it needs less (<) operator overloading.
I referred to sgi and msdn documentation, but I could not find much about implementation details of set::find() there, either.
What am I doing wrong in my set::find() implementation?
You can use a tuple to easily get a lexicographical ordering of your members:
return std::tie(lhs.i, lhs.f, lhs.c) < std::tie(rhs.i, rhs.f, rhs.c);
This requires that every member be of a comparable type, e.g. lhs.i < rhs.i makes sense.
Note that std::tie and std::tuple are available only from C++11 onward, so for C++03 you can use e.g. Boost.Tuple, which does provide a boost::tie (boost::tuple uses the same ordering as std::tuple).
As to where this should go, it is customary to put it in an operator< (after all, this is what makes the use of tie for an easy ordering possible in the first place). Quite often this operator will be a friend, so this would look like:
class foo {
public:
/* public interface goes here */
// declaration of non-member friend operator
// if it doesn't need to be a friend, this declaration isn't needed
friend
bool operator<(foo const& lhs, foo const& rhs);
private:
T t;
U u;
V v;
};
bool operator<(foo const& lhs, foo const& rhs)
{
// could be boost::tie
return std::tie(lhs.t, lhs.u, lhs.v) < std::tie(rhs.t, rhs.u, rhs.v);
}
As you can see it's not fully automatic as the implementation of operator< needs to list every member of foo (or at least those that matter for the ordering), twice. There isn't a better way I'm afraid.
Instead of providing an operator< you can specialize std::less for foo, but that's a bit exotic and not the preferred way. If the ordering does not make sense as part of the extended interface of foo (e.g. there might be more than one sensible ordering and no canonical one), then the preferred way is to write a functor:
struct foo_ordering {
bool operator()(foo const& lhs, foo const& rhs) const
{
/* implementation as before, but access control/friendship
has to be planned for just like for operator< */
}
};
Then you'd use e.g. std::set<foo, foo_ordering>.
Be aware that no matter what form the ordering takes (operator<, std::less<foo> or a functor), if it is used with a std::set or any other associative container (by default std::set<T> uses std::less<T>, which in turn uses operator<), it must meet stringent criteria: it must be a strict weak ordering. However, if all the members used for the foo ordering themselves have strict weak orderings, then the resulting lexicographical ordering is also a strict weak ordering.
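Putting the functor approach together with set::find(), a minimal sketch might look like this. Record, RecordOrdering and contains are hypothetical names, and the members are public so that no friendship is needed.

```cpp
#include <cassert>
#include <set>
#include <tuple>

// Hypothetical three-member class, public members for brevity.
struct Record {
    int i;
    double f;
    char c;
};

// Ordering functor: lexicographical comparison over all three members.
struct RecordOrdering {
    bool operator()(const Record& lhs, const Record& rhs) const {
        return std::tie(lhs.i, lhs.f, lhs.c) < std::tie(rhs.i, rhs.f, rhs.c);
    }
};

using RecordSet = std::set<Record, RecordOrdering>;

// find() locates an element equivalent to r under RecordOrdering, if any.
bool contains(const RecordSet& s, const Record& r) {
    return s.find(r) != s.end();
}

void demoFind() {
    const RecordSet s = {Record{1, 0.5, 'a'}, Record{1, 0.5, 'b'}, Record{2, 0.0, 'a'}};
    assert(s.size() == 3);
    assert(contains(s, Record{1, 0.5, 'b'}));
    assert(!contains(s, Record{1, 0.5, 'c'}));  // differs only in the last member
}
```

Adding a fourth or fifth member only means adding it to both std::tie calls.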
You have to define a strict ordering of your objects. So if your object is made up of n members a_1 .. a_n which all have a strict ordering themselves, what you can do is:
bool operator< (const TYPE &rhs) const {
if (a_1 < rhs.a_1) return true; else if (a_1 > rhs.a_1) return false;
if (a_2 < rhs.a_2) return true; else if (a_2 > rhs.a_2) return false;
...
if (a_n < rhs.a_n) return true;
return false;
}
Edit:
If either boost or C++11 is an option for you, you should really go with the std::tie/boost::tie method Luc Danton suggests in his answer. It's much cleaner.
The element comparison function of std::set must define a strict weak ordering on the element domain. Under this definition, two elements are equivalent if compare( a, b ) is false and compare( b, a ) is false too. std::set::find can be implemented on this basis.
You can find more here: http://www.sgi.com/tech/stl/set.html and http://www.sgi.com/tech/stl/StrictWeakOrdering.html
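The equivalence test described above can be written out as a small helper. This is only an illustrative sketch (equivalent is a hypothetical function, not part of the standard library), but it shows why a comparison such as <= is unsuitable: under it, a value is not even equivalent to itself.

```cpp
#include <cassert>
#include <functional>

// Two values are equivalent under comp when neither is ordered before the other.
template <typename T, typename Compare>
bool equivalent(const T& a, const T& b, Compare comp) {
    return !comp(a, b) && !comp(b, a);
}

void demoEquivalence() {
    assert(equivalent(3, 3, std::less<int>()));
    assert(!equivalent(3, 4, std::less<int>()));
    // <= is not a strict weak ordering: comp(3, 3) is true, so 3 is not
    // equivalent to itself and a lookup based on this test could never match it.
    assert(!equivalent(3, 3, std::less_equal<int>()));
}
```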
Your operator< should be able to compare any object with the given one, like this:
struct Data
{
bool operator < (const Data& right) const
{
return( (this->i < right.i) ||
(!(this->i > right.i) && (this->f < right.f)) ||
(!(this->i > right.i) && !(this->f > right.f) && (this->c < right.c)));
}
};
Also, your comparison algorithm looks doubtful, because it does not explicitly handle the cases where
this->i == right.i
or
this->f == right.f
You also should not depend on how std::set is implemented. The implementation can differ from compiler to compiler and may change in the future. Your program should make assumptions only about the container's interface, never its implementation.
This is only a partial answer, but a detailed documentation of STL can be found on the website of SGI.

C++ class hierarchy for collection providing iterators

I'm currently working on a project in which I'd like to define a generic 'collection' interface that may be implemented in different ways. The collection interface should specify that the collection has methods that return iterators by value. Using classes that wrap pointers I came up with the following (greatly simplified):
Collection.h
class Collection
{
CollectionBase *d_base;
public:
Collection(CollectionBase *base);
Iterator begin() const;
};
inline Iterator Collection::begin() const
{
return d_base->begin();
}
CollectionBase.h
class CollectionBase
{
public:
virtual Iterator begin() const = 0;
virtual Iterator end() const = 0;
};
Iterator.h
class Iterator
{
IteratorBase *d_base;
public:
bool operator!=(Iterator const &other) const;
};
inline bool Iterator::operator!=(Iterator const &other) const
{
return d_base->operator!=(*other.d_base);
}
IteratorBase.h
class IteratorBase
{
public:
virtual bool operator!=(IteratorBase const &other) const = 0;
};
Using this design, different implementations of the collection derive from CollectionBase and can return their custom iterators by returning an Iterator that wraps some specific implementation of IteratorBase.
All is fine and dandy so far. I'm currently trying to figure out how to implement operator!= though. Iterator forwards the call to IteratorBase, but how should the operator be implemented there? One straightforward way would be to just cast the IteratorBase reference to the appropriate type in implementations of IteratorBase and then perform the specific comparison for the implementation of IteratorBase. This assumes that you will play nice and not pass two different types of iterators though.
Another way would be to perform some type of type checking that checks if the iterators are of the same type. I believe this check will have to be made at run-time though, and considering this is an iterator I'd rather not perform expensive run time type checking in operator!=.
Am I missing any nicer solutions here? Perhaps there are better alternative class designs (the current design is an adaptation from something I learned in a C++ course I'm taking)? How would you approach this?
Edit: To everyone pointing me to the STL containers: I am aware of their existence. I cannot use these in all cases however, since the amounts of data I need to process are often enormous. The idea here is to implement a simple container that uses the disk as storage instead of memory.
This is not the way you should be using C++. I strongly suggest you investigate the standard library container classes, such as std::vector and std::map, and the use of templates. Inheritance should always be the design tool of last resort.
Please do mimic the STL way of doing containers. That way, it would be possible to e.g. use <algorithm> with your containers.
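As a very small sketch of what "the STL way" buys you: any container that exposes begin() and end() returning something iterator-like works with <algorithm> and range-based for. SmallBuffer below is a hypothetical in-memory container, not the disk-backed one from the question; it uses plain pointers as iterators.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>

// Hypothetical fixed-size container; raw pointers already satisfy the iterator requirements.
class SmallBuffer {
    int data_[4] = {4, 8, 15, 16};
public:
    using const_iterator = const int*;
    const_iterator begin() const { return data_; }
    const_iterator end() const { return data_ + 4; }
};

void demoAlgorithms() {
    SmallBuffer buf;
    assert(std::find(buf.begin(), buf.end(), 15) != buf.end());
    assert(std::find(buf.begin(), buf.end(), 7) == buf.end());
    assert(std::accumulate(buf.begin(), buf.end(), 0) == 43);

    int count = 0;
    for (int value : buf) {  // range-based for only needs begin() and end()
        if (value > 5) ++count;
    }
    assert(count == 3);
}
```

A disk-backed container would need a real iterator class instead of a pointer, but the interface seen by the algorithms stays the same.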
If you want to use inheritance for your iterators, I would recommend using a different approach from the STL's begin()/end().
Have a look at IEnumerator from the .NET framework, for example. (MSDN documentation)
Your base classes can look like this:
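class IteratorBase
{
public:
    virtual ~IteratorBase() {} // iterators are deleted through a base pointer
    virtual bool isEnd() const = 0;
    virtual void next() = 0; // advancing modifies the iterator, so it is not const
};
class CollectionBase
{
public:
    // ...
    // the caller takes ownership of the returned iterator
    virtual IteratorBase* createIterator() const = 0;
};
// usage (needs <memory>; std::unique_ptr replaces std::auto_ptr, which was removed in C++17):
for (std::unique_ptr<IteratorBase> it(collection.createIterator()); !it->isEnd(); it->next())
{
    // do something
}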
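If you want to stay with begin()/end(), you can use dynamic_cast to check that the other iterator has the right type:
class MyIteratorBaseImpl : public IteratorBase
{
public:
    virtual bool operator!=(IteratorBase const &other) const
    {
        MyIteratorBaseImpl const *other2 = dynamic_cast<MyIteratorBaseImpl const *>(&other);
        if (!other2)
            return true; // other is not of our type, so the iterators differ
        // now you can compare to other2
        return d_pos != other2->d_pos;
    }
private:
    std::size_t d_pos; // hypothetical position member, for illustration only
};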
I would advise adding a virtual 'entity-id' function to the iterator and having operator!= compare this->entity_id() with other.entity_id() (in my example, the 'position' function is such an 'entity-id' function).
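A sketch of how such an approach might look, under the assumption that an iterator is identified by the collection it traverses plus its position in it. The names entityId, position and ArrayIterator are hypothetical, and this IteratorBase is a simplified variant of the one in the question: here operator!= is a non-virtual member built on the two virtual functions, so no cast or run-time type check is needed.

```cpp
#include <cassert>
#include <cstddef>

class IteratorBase {
public:
    virtual ~IteratorBase() = default;
    // Hypothetical identity functions supplied by each implementation.
    virtual const void* entityId() const = 0;   // which collection is traversed
    virtual std::size_t position() const = 0;   // where in that collection we are
    // Non-virtual comparison expressed purely through the virtual interface.
    bool operator!=(const IteratorBase& other) const {
        return entityId() != other.entityId() || position() != other.position();
    }
};

// Example implementation over a plain int array.
class ArrayIterator : public IteratorBase {
    const int* data_;
    std::size_t index_;
public:
    ArrayIterator(const int* data, std::size_t index) : data_(data), index_(index) {}
    const void* entityId() const override { return data_; }
    std::size_t position() const override { return index_; }
};

void demoIteratorCompare() {
    const int a[3] = {1, 2, 3};
    const int b[3] = {1, 2, 3};
    const ArrayIterator a0(a, 0), a0again(a, 0), a1(a, 1), b0(b, 0);
    assert(!(a0 != a0again));  // same collection, same position
    assert(a0 != a1);          // same collection, different position
    assert(a0 != b0);          // different collections
}
```

The cost per comparison is two virtual calls, which may or may not be cheaper than a dynamic_cast in practice; measuring would be needed to tell.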