Large POD as tuple for sorting - c++

I have a POD with about 30 members of various types. I want to store thousands of these PODs in a container and then sort that container by one of those members.
For example:
struct Person {
int idNumber;
// ...many other members
};
Thousands of Person objects which I want to sort by idNumber or by any other member I choose to sort by.
I've been researching this for a while today, and it seems the most efficient, or at least simplest, solution is not to use a struct at all but a tuple, so that I can pass an index number to a custom comparison functor for use in std::sort. (An example on this page shows one way to implement this type of sort easily, but it does so on a single member of a struct, which makes templating harder because you must refer to the member by name rather than by the index a tuple provides.)
My two-part question on this approach is 1) Is it acceptable for a tuple to be fairly large, with dozens of members? and 2) Is there an equally elegant solution for continuing to use struct instead of tuple for this?

You can make a comparator that stores a pointer to member internally, so it knows which member to use for the comparison:
#include <algorithm>
#include <vector>
struct POD {
int i;
char c;
float f;
long l;
double d;
short s;
};
template<typename C, typename T>
struct Comp {
explicit Comp(T C::* p) : ptr(p) {}
bool operator()(const C& p1, const C& p2) const
{
return p1.*ptr < p2.*ptr;
}
private:
T C::* ptr;
};
// helper function to make a comparator easily
template<typename C, typename T>
Comp<C,T> make_comp( T C::* p)
{
return Comp<C,T>(p);
}
int main()
{
std::vector<POD> v;
std::sort(v.begin(), v.end(), make_comp(&POD::i));
std::sort(v.begin(), v.end(), make_comp(&POD::d));
// etc...
}
To further generalize this, make make_comp take a custom comparator, so you can have greater-than and other comparisons.
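One way this generalization might look is sketched below. The names CompBy and make_comp_by are hypothetical (they are not part of the answer above), and the POD is cut down to two members for brevity. The ordering defaults to std::less, so the single-argument call still sorts ascending.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

struct POD {
    int i;
    double d;
};

// Compares two objects of type C by the member `ptr`, using the ordering `cmp`.
template<typename C, typename T, typename Cmp>
struct CompBy {
    CompBy(T C::* p, Cmp c) : ptr(p), cmp(c) {}
    bool operator()(const C& a, const C& b) const
    {
        return cmp(a.*ptr, b.*ptr);
    }
private:
    T C::* ptr;
    Cmp cmp;
};

// Helper: the ordering defaults to std::less<T>, i.e. ascending order.
template<typename C, typename T, typename Cmp = std::less<T>>
CompBy<C, T, Cmp> make_comp_by(T C::* p, Cmp cmp = Cmp())
{
    return CompBy<C, T, Cmp>(p, cmp);
}

// Sorts a small sample by `i` in descending order.
void example_descending()
{
    std::vector<POD> v = { {1, 3.0}, {3, 1.0}, {2, 2.0} };
    std::sort(v.begin(), v.end(), make_comp_by(&POD::i, std::greater<int>()));
    assert(v[0].i == 3 && v[1].i == 2 && v[2].i == 1);
}
```

Calling make_comp_by(&POD::d) without a second argument gives the same behaviour as the original make_comp.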

1) Is it acceptable for a tuple to be fairly large, with dozens of members?
Yes, it is acceptable. However, it won't be easy to maintain, since all you'll have to work with is an index into the tuple, which is much like a magic number. The best you could do is to reintroduce a name-to-index mapping using an enum, which is hardly maintainable either.
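To make the trade-off concrete, here is a minimal sketch of the tuple-plus-enum variant. PersonTuple, PersonField and CompareElement are hypothetical names, and the three fields stand in for the 30 or so members from the question. The enum has to be kept in sync with the tuple by hand, which is the maintenance burden described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical three-field record standing in for the large POD.
using PersonTuple = std::tuple<int, std::string, double>;

// Name-to-index mapping; nothing ties it to PersonTuple except convention.
enum PersonField : std::size_t { IdNumber = 0, Name = 1, Height = 2 };

// Orders tuples by the element at compile-time index I.
template<std::size_t I>
struct CompareElement {
    template<typename Tuple>
    bool operator()(const Tuple& a, const Tuple& b) const
    {
        return std::get<I>(a) < std::get<I>(b);
    }
};

void example_sort_by_name()
{
    std::vector<PersonTuple> people = {
        PersonTuple(3, "Carol", 1.70),
        PersonTuple(1, "Alice", 1.65),
        PersonTuple(2, "Bob", 1.80),
    };
    std::sort(people.begin(), people.end(), CompareElement<Name>());
    assert(std::get<Name>(people[0]) == "Alice");
    assert(std::get<IdNumber>(people[2]) == 3);
}
```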
2) Is there an equally elegant solution for continuing to use struct instead of tuple for this?
You can easily write a template function to access a specific struct member (to be fair, I didn't put much effort into it, it's more a proof of concept than anything else so that you get an idea how it can be done):
template<typename T, typename R, R T::* M>
R get_member(T& o) {
return o.*M;
}
struct Foo {
int i;
bool j;
float k;
};
int main() {
Foo f = { 3, true, 3.14 };
std::cout << get_member<Foo, float, &Foo::k>(f) << std::endl;
return 0;
}
From there, it's just as easy to write a generic comparator which you can use at your leisure (I'll leave it to you as an exercise). This way you can still refer to your members by name, yet you don't need to write a separate comparator for each member.

You could use a template to extract the sort key:
struct A
{
std::string name;
int a, b;
};
template<class Struct, typename T, T Struct::*Member>
struct compare_member
{
bool operator()(const Struct& lh, const Struct& rh) const
{
return lh.*Member < rh.*Member;
}
};
int main()
{
std::vector<A> values;
std::sort(begin(values), end(values), compare_member<A, int, &A::a>());
}
You may also want to have a look at boost::multi_index_container, which is a very powerful container if you want to index (sort) objects by different keys.

Create a class which can use a pointer to a Person member data to use for comparison:
std::sort(container.begin(), container.end(), Compare(&Person::idNumber));
Where Compare is (the call above relies on C++17 class template argument deduction):
template<typename PointerToMemberData>
struct Compare {
    Compare(PointerToMemberData pointerToMemberData) :
        pointerToMemberData(pointerToMemberData) {
    }
    template<typename Type>
    bool operator()(const Type& lhs, const Type& rhs) const {
        return lhs.*pointerToMemberData < rhs.*pointerToMemberData;
    }
    PointerToMemberData pointerToMemberData;
};

Related

Template in which one argument depends on another argument

I have the following situation. Let say we want to implement a sorted array data structure which keeps the array sorted upon insertion. At first attempt, I would do something like:
template<typename T, typename Comparator, Comparator comparator>
class SortedArray {
public:
    void find(T value);
    void insert(T value);
    void remove(T value);
};
The argument T is of course for the type of the elements in the array. Then I need a comparator to tell how to compare objects of type T so that I can keep the elements sorted. Since I want to allow for both function pointers (as in classical qsort) as well as function objects and maybe lambda as well, I need to add the template parameter for the comparator.
Now the problem is that I want the compiler to automatically deduce the 2nd Comparator argument based on the 3rd argument. Right now, a typical usage will be exploiting decltype like
int compare_int(int x, int y) {
return x - y;
}
SortedArray<int, decltype(compare_int), compare_int> myArray;
but this doesn't work with lambda and certainly I would love to just write
SortedArray<int, compare_int> myArray;
instead.
Any idea or is it actually possible in C++ at the moment?
You can use an auto non-type template parameter (available since C++17) as follows:
template<typename T, auto C >
class SortedArray
{
private:
std::vector<T> v;
public:
void sort(){ std::sort( v.begin(), v.end(), C );}
void print() { for( auto& el: v ) std::cout << el << std::endl; }
void push(T t){ v.push_back(t);}
};
bool compare_int( int a, int b )
{
return a<b;
}
int main()
{
SortedArray<int, compare_int> sa1;
sa1.push(5);
sa1.push(3);
sa1.push(7);
sa1.sort();
sa1.print();
SortedArray<int, [](int a, int b){ return a<b;} > sa2;
sa2.push(5);
sa2.push(3);
sa2.push(7);
sa2.sort();
sa2.print();
}
As you can see, you can also use a lambda as a template argument (this requires C++20).
There is no longer any need for template gymnastics with derived template parameters.
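The answer's class sorts on demand, whereas the question asked for an array that stays sorted on insertion. A minimal sketch of that variant, using the same auto parameter, could look like the following. SortedVector and descending are hypothetical names, and the comparator is assumed to be a strict "less than" style predicate (unlike the qsort-style compare_int in the question). It needs C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Less is any callable usable as a strict-weak-ordering "less than".
template<typename T, auto Less>
class SortedVector {
public:
    // Inserts `value` after any equivalent elements, keeping `items` ordered.
    void insert(const T& value)
    {
        auto pos = std::upper_bound(items.begin(), items.end(), value, Less);
        items.insert(pos, value);
    }
    // Binary search; valid because `items` is always ordered by Less.
    bool contains(const T& value) const
    {
        return std::binary_search(items.begin(), items.end(), value, Less);
    }
    const std::vector<T>& data() const { return items; }
private:
    std::vector<T> items;
};

bool descending(int a, int b) { return a > b; }

void example_sorted_insert()
{
    SortedVector<int, descending> sv;
    sv.insert(5);
    sv.insert(3);
    sv.insert(7);
    assert((sv.data() == std::vector<int>{7, 5, 3}));
}
```

Each insert costs a binary search plus a linear shift of the tail, which is usually acceptable for a vector-backed sorted array.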

Hashable type with overwritten operators or external functors

To use a custom type in a std::unordered_set I have two options.
1) Implement the == operator for my type and specialize std::hash
struct MyType {
    int x;
    bool operator==(const MyType& o) const {
        return x == o.x;
    }
};
namespace std
{
template<>
struct hash<MyType> {
size_t operator()(const MyType& o) const {
return hash<int>()(o.x);
}
};
}
std::unordered_set<MyType> mySet;
Or 2), provide functor classes:
struct MyTypeHash {
size_t operator()(const MyType& o) const {
return std::hash<int>()(o.x);
}
};
struct MyTypeCompare {
bool operator()(const MyType& o1, const MyType& o2) const {
return o1.x == o2.x;
}
};
std::unordered_set<MyType, MyTypeHash, MyTypeCompare> mySet;
The second approach lets me choose new behaviour for every new instantiation of std::unordered_set, while with the first approach the behaviour, being part of the type itself, will always be the same.
Now, if I know that I only ever want a single behaviour (I'll never define two different comparators for MyType), which approach is to be preferred? What other differences exist between those two?
Attaching the behavior to the type allows for code like
template<template<class...> class Set,class T>
auto organizeWithSet(…);
/* elsewhere */ {
organizeWithSet<std::unordered_set,MyType>(…);
organizeWithSet<std::set,MyType>(…);
}
which obviously cannot pass custom function objects.
That said, it is possible to define
template<class T>
using MyUnorderedSet=std::unordered_set<T, MyTypeHash,MyTypeCompare>;
and use that as a template template argument, although that introduces yet another name and might be considered less readable.
Otherwise, you have to consider that your operator== is simultaneously the default for std::unordered_set and std::find, among others; if the equivalence you want for these purposes varies, you probably want named comparators. On the other hand, if one suffices, C++20 might even let you define it merely with =default.
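As a rough illustration of the alias approach mentioned above, the sketch below passes both standard set templates and the alias to one generic function. count_distinct is a hypothetical helper invented for this example. The template template parameter is declared with class... so that it accepts std::set, std::unordered_set and the one-parameter alias alike.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <unordered_set>
#include <vector>

struct MyType {
    int x;
};

struct MyTypeHash {
    std::size_t operator()(const MyType& o) const { return std::hash<int>()(o.x); }
};
struct MyTypeCompare {
    bool operator()(const MyType& a, const MyType& b) const { return a.x == b.x; }
};

// Alias that bakes the functors in, so it can be passed wherever a plain
// "set of T" template is expected.
template<class T>
using MyUnorderedSet = std::unordered_set<T, MyTypeHash, MyTypeCompare>;

// Hypothetical generic helper: counts distinct values with any set-like template.
template<template<class...> class Set, class T>
std::size_t count_distinct(const std::vector<T>& values)
{
    Set<T> s(values.begin(), values.end());
    return s.size();
}

void example_alias()
{
    std::vector<MyType> v = { {1}, {2}, {1} };
    assert(count_distinct<MyUnorderedSet>(v) == 2);
}
```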

Comparer that takes the wanted attribute

In order to use a standard function like std::sort on some standard container Container<T>
struct T{
int x,y;
};
based on the y value, you need to write something like (for example):
std::vector<T> v;
//fill v
std::sort(v.begin(),v.end(),[](const auto& l,const auto& r){
return l.y<r.y;
});
The comparer that was written as lambda function is used too much and re-written again and again during the code for various classes and attributes.
Considering the case where y's type is comparable (either like int or there is an overload for the < operator), is there any way to achieve something like:
std::sort(v.begin(),v.end(),imaginary::less(T::y)); // Imaginary code
Is it possible in C++ to write such a function like less? or anything similar?
I am asking because I remember something like that in some managed language (I am not sure maybe C# or Java). However, I am not sure even about this information if it is true or not.
template<typename T, typename MT>
struct memberwise_less
{
MT T::* const mptr;
auto operator()(const T& left, const T& right) const
{ return (left.*mptr) < (right.*mptr); }
};
template<typename T, typename MT>
memberwise_less<T, MT> member_less(MT T::*mptr)
{
return { mptr };
}
and then you can do
std::sort(v.begin(), v.end(), member_less(&T::y));
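If C++17 is available, a slightly more general variant can be sketched with std::invoke, which accepts a pointer to data member, a pointer to member function, or an ordinary callable. The name less_by is hypothetical and not part of the answer above.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

struct T {
    int x, y;
};

// Builds a "less than" comparator from a projection: a pointer to member
// or any callable that takes the element and returns the sort key.
template<typename Proj>
auto less_by(Proj proj)
{
    return [proj](const auto& l, const auto& r) {
        return std::invoke(proj, l) < std::invoke(proj, r);
    };
}

void example_less_by()
{
    std::vector<T> v = { {1, 9}, {2, 4}, {3, 7} };
    std::sort(v.begin(), v.end(), less_by(&T::y));
    assert(v[0].x == 2 && v[1].x == 3 && v[2].x == 1);
}
```

In C++20, std::ranges::sort takes such a projection directly: std::ranges::sort(v, {}, &T::y).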

Modify std::less on a shared_ptr

This is what I have:
struct Foo {
int index;
};
std::set<std::shared_ptr<Foo>> bar;
I want to order bar's elements by their indices instead of by the default std::less<std::shared_ptr<T>> function, which compares the pointers.
I read I can type std::set<std::shared_ptr<Foo>, std::owner_less<std::shared_ptr<Foo>>> bar, but I'd prefer to stick to the previous syntax.
I tried defining std::less<std::shared_ptr<Foo>>, but it's not actually being used by the set functions. Is there a way I can achieve this?
If you want to compare by their indices, you'll have to write a comparator that checks by their indices. std::less<> will do the wrong thing (since it won't know about index) and std::owner_less<> will do the wrong thing (since it still won't compare the Foos, but rather has to do with ownership semantics of them).
You have to write:
struct SharedFooComparator {
bool operator()(const std::shared_ptr<Foo>& lhs,
const std::shared_ptr<Foo>& rhs) const
{
return lhs->index < rhs->index;
}
};
and use it:
std::set<std::shared_ptr<Foo>, SharedFooComparator> bar;
You could additionally generalize this to a generic comparator for shared_ptr's:
struct SharedComparator {
template <typename T>
bool operator()(const std::shared_ptr<T>& lhs,
const std::shared_ptr<T>& rhs) const
{
return (*lhs) < (*rhs);
}
};
and then simply make Foo comparable.
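Put together, making Foo comparable and using the generic comparator might look like the sketch below. The free operator< and the FooSet alias are assumptions added for illustration, and the comparator assumes the set never holds a null pointer.

```cpp
#include <cassert>
#include <memory>
#include <set>

struct Foo {
    int index;
};

// Makes Foo comparable, so that (*lhs) < (*rhs) below is well-formed.
bool operator<(const Foo& a, const Foo& b) { return a.index < b.index; }

struct SharedComparator {
    template <typename T>
    bool operator()(const std::shared_ptr<T>& lhs,
                    const std::shared_ptr<T>& rhs) const
    {
        return (*lhs) < (*rhs);  // assumes neither pointer is null
    }
};

using FooSet = std::set<std::shared_ptr<Foo>, SharedComparator>;

void example_shared_set()
{
    FooSet bar;
    bar.insert(std::make_shared<Foo>(Foo{3}));
    bar.insert(std::make_shared<Foo>(Foo{1}));
    bar.insert(std::make_shared<Foo>(Foo{3}));  // same index: treated as a duplicate
    assert(bar.size() == 2);
    assert((*bar.begin())->index == 1);
}
```

Note the consequence shown in the last insert: two distinct Foo objects with the same index are equivalent under this ordering, so the set keeps only one of them.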
You can provide your own specialization of less<shared_ptr<Foo>> in the std namespace.
namespace std
{
    template<>
    class less<shared_ptr<Foo>>
    {
    public:
        bool operator()(const shared_ptr<Foo>& a, const shared_ptr<Foo>& b) const
        {
            // Compare *a and *b in some way, e.g. by index as in the question
            return a->index < b->index;
        }
    };
}
Then you can form a set<shared_ptr<Foo>> without a comparator. I needed this for a priority_queue<shared_ptr<Foo>>, where I didn't want to use a priority_queue<Foo*, vector<Foo*>, int (*)(const Foo*, const Foo*)>. I am not proud of it, but it works.

c++: can i process std::_Container_base without knowing if it is a map or a vector?

Suppose I have:
class C1 : public B { /*...*/ };
class C2 : public B { /*...*/ };
std::map<std::string, C1> myMap;
std::vector<C2> myVector;
Is there a way (and what would be the syntax) to call a function foo that
- only needs to use the functionality of B, and
- processes all elements of the map and the vector without caring how they are organized?
std::vector and std::map are both std::_Container_base's, but I have no clue how to write the syntax for (pseudocode):
void foo(std::_Container_base-of-Bs)
EDIT: it's _Container_base, not _Tee
The C++ way is to use templates and iterators.
template <typename ForwardIterator>
void process_bs(ForwardIterator first, ForwardIterator last) {
std::for_each(first, last, [](B& b) {
// do something to b here
});
}
For vector, list, deque and set, you can trivially call this using begin and end:
process_bs(v.begin(), v.end());
For map, the element type is pair<const Key, Value>, so you have to adapt the iterators. You can use this with Boost.Range, for example:
#include <boost/range/adaptor/map.hpp>
auto values = m | boost::adaptors::map_values;
process_bs(values.begin(), values.end());
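If Boost is not an option, the map case can also be handled with a small overload that unwraps the pair itself. The sketch below is one possible shape: for_each_b is a hypothetical name, and B, C1 and C2 are reduced to minimal stand-ins for the classes in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct B { int hits = 0; };
struct C1 : B {};
struct C2 : B {};

// Applies f to each element of a sequence container of B-derived objects.
template <typename Container, typename F>
void for_each_b(Container& c, F f)
{
    std::for_each(c.begin(), c.end(), [&f](B& b) { f(b); });
}

// Overload for std::map: unwraps the pair and passes only the mapped value.
template <typename K, typename V, typename F>
void for_each_b(std::map<K, V>& m, F f)
{
    for (auto& kv : m) f(static_cast<B&>(kv.second));
}

void example_for_each_b()
{
    std::vector<C2> v(3);
    std::map<std::string, C1> m;
    m["a"];
    m["b"];
    int count = 0;
    auto bump = [&count](B& b) { ++b.hits; ++count; };
    for_each_b(v, bump);  // picks the generic overload
    for_each_b(m, bump);  // picks the more specialized map overload
    assert(count == 5);
}
```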
EDIT: The below is a survey on the workarounds, whereas the actual question is not answered therein. So here is the answer: I don't know whether one can process std::_Container_base without knowing if it is a map or a vector.
I couldn't find anything reasonable on the web regarding std::_Container_base, and particularly no C++ standard things, so I would guess it stems from a specific compiler implementation.
vector and map are completely different storage schemes. I suggest you not use them generically in the same context. That is, you could in principle write a function template
template<typename T> void foo(T&& t) { /* takes a vector and a map */ }
but at least when you access operator[], they'll behave differently. That would be unintuitive and error-prone.
However, this doesn't mean you cannot combine the two approaches -- and abstract on size(), operator[](int) and possibly other things like some insertion mechanism.
For example, in some recent code of mine, I have a vector storage scheme (which uses std::vector under the hood), as well as a piecewise constant vector (which uses a std::map). If you want to do this, you can derive those two from a common base class
template<typename T>
struct ContainerBase
{
virtual int size() const = 0;
virtual T operator[](int) const = 0;
virtual void insert(int, T) = 0; //if required
};
and then set up the required functionality in the derived classes Vector and Map.
template<typename T>
struct Vector : ContainerBase<T>
{
    T operator[](int i) const override { return _v[i]; }
    int size() const override { return static_cast<int>(_v.size()); }
    // ... insert and so on
    std::vector<T> _v;
};
template<typename T>
struct Map : ContainerBase<T>
{
    T operator[](int i) const override
    {
        return _v.lower_bound(i)->second; // add further checks if nothing is found
    }
    int size() const override { return _v.rbegin()->first; } // return highest index
    // ... insert and so on
    std::map<int, T> _v;
};
The Map implementation is just a sketch. You should choose some reasonable behaviour for it.
With this, it is easy to set up a function foo(ContainerBase&) which works for both Vector and Map.
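A self-contained sketch of the whole idea follows, with insert omitted to keep it short. The function sum is a hypothetical stand-in for foo, the member names are shortened, and the fallback value returned by Map for an index beyond the last key is an assumption; as noted above, the Map semantics are yours to choose.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

template<typename T>
struct ContainerBase {
    virtual ~ContainerBase() = default;
    virtual int size() const = 0;
    virtual T operator[](int) const = 0;
};

template<typename T>
struct Vector : ContainerBase<T> {
    int size() const override { return static_cast<int>(v.size()); }
    T operator[](int i) const override { return v[static_cast<std::size_t>(i)]; }
    std::vector<T> v;
};

// Piecewise-constant values: index i yields the value stored at the first
// key >= i. As in the sketch above, size() is the highest key.
template<typename T>
struct Map : ContainerBase<T> {
    int size() const override { return m.empty() ? 0 : m.rbegin()->first; }
    T operator[](int i) const override
    {
        auto it = m.lower_bound(i);
        return it != m.end() ? it->second : T();  // fallback is an assumption
    }
    std::map<int, T> m;
};

// Works with either storage scheme through the common interface.
double sum(const ContainerBase<double>& c)
{
    double total = 0.0;
    for (int i = 0; i < c.size(); ++i) total += c[i];
    return total;
}

void example_sum()
{
    Vector<double> vec;
    vec.v = {1.0, 2.0, 3.0};
    assert(sum(vec) == 6.0);
}
```

The price of this design is a virtual call per element access, which may matter in tight loops; the template-based approaches avoid it.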
To use B subclasses transparently in the foo function, you can do it this way:
#include <iostream>
#include <map>
#include <vector>
#include <string>
#include <type_traits>
#include <utility>
struct B{
int b_member;
};
class C1 : public B { /*...*/ };
class C2 : public B { /*...*/ };
std::map<std::string, C1> myMap;
std::vector<C2> myVector;
// all the magic is in the get_B overloads
template<typename E, typename std::enable_if<std::is_base_of<B, E>::value>::type* a = nullptr>
B& get_B(E& elem)
{
return elem;
}
template<typename E, typename std::enable_if<std::is_base_of<B, typename E::second_type>::value>::type* a = nullptr>
B& get_B(E& elem)
{
return elem.second;
}
// foo can call get_B to hide implementation details of the container
template<typename T>
void foo( T& container)
{
for(auto& elem : container)
{
std::cout << get_B(elem).b_member << '\n';
}
}
int main()
{
myVector.resize(10);
myMap["one"] = {};
foo(myMap);
foo(myVector);
}
Thanks to SFINAE, foo uses the correct overload of get_B to get a reference to the B subobject you want to process.