How to create a map with custom class/comparator as key


I have a class named ItemKey. It has two members, both double, named m_t and m_f. Two ItemKey objects are considered equal if each of these members differs from its counterpart by no more than the respective tolerance. The comparison operator is defined with this logic as well. However, when I insert objects of this type as keys into a map, only one key ends up in the map, even though at least three such keys should be present:
#include <iostream>
#include <string>
#include <map>
#include <cmath>
#include <vector>
using namespace std;
class ItemKey
{
public:
ItemKey(double t, double f)
{
m_t = t;
m_f = f;
}
double m_t;
double m_f;
double m_tEpsilon = 3;
double m_fEpsilon = 0.1;
bool operator<(const ItemKey& itemKey) const
{
int s_cmp = (abs(itemKey.m_f - m_f) > m_fEpsilon);
if (s_cmp == 0)
{
return (abs(itemKey.m_t - m_t) > m_tEpsilon);
}
return s_cmp < 0;
}
};
int main()
{
// The pairs are the respective values of m_t and m_f.
vector<pair<double, double>> pairs;
// These two should belong in one bucket -> (109.9, 9.0), because m_f differs by 0.09 and m_t differs by just 1
pairs.emplace_back(109.9, 9.0);
pairs.emplace_back(110.9, 9.09);
// This one is separate from the above two because even though m_t is in range, m_f is beyond the tolerance level
pairs.emplace_back(109.5, 10.0);
// Same for this one; here both m_t and m_f are beyond the tolerance of either of the two categories found above
pairs.emplace_back(119.9, 19.0);
// This one matches the second bucket - (109.5, 10.0)
pairs.emplace_back(109.9, 10.05);
// And this one too.
pairs.emplace_back(111.9, 9.87);
map<ItemKey, size_t> itemMap;
for (const auto& item: pairs)
{
ItemKey key(item.first, item.second);
auto iter = itemMap.find(key);
if (iter == itemMap.end())
{
itemMap[key] = 1;
}
else
{
itemMap[iter->first] = itemMap[iter->first] + 1;
}
}
// The map should have three keys - (109.9, 9.0) -> count 2, (109.5, 10.0) -> count 3 and (119.9, 19.0) -> count 1
cout << itemMap.size();
}
However, the map seems to have only 1 key. How do I make it work as expected?

Why isn't your version working?
You did well to create your own comparison function. To answer your question, there is an error in your operator<(): it returns true only if m_f is within tolerance and m_t is outside of tolerance, which I'm guessing is not what you wanted. For your sample data neither a < b nor b < a is ever true, so the map treats every key as equivalent to the first one. Let's take a look.
int s_cmp = (abs(itemKey.m_f - m_f) > m_fEpsilon);
The above line checks whether this->m_f and itemKey.m_f differ by more than the tolerance (meaning they are not equal to each other). That is probably what was intended. Then you say
if (s_cmp == 0)
{
return (abs(itemKey.m_t - m_t) > m_tEpsilon);
}
If the comparison is true, s_cmp has the value 1 (the m_f values are not within tolerance of each other), and it has the value 0 when the comparison is false (they are within tolerance). So inside the if, you return true if the m_t values are outside of tolerance. Up to this point, you return true if m_f is equal (according to tolerance) and m_t is not equal (according to tolerance). Then your last line of code
return s_cmp < 0;
will always return false, since s_cmp is 1 at this point and a boolean converted to an integer can never be negative.
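To see the effect on the sample data, here is a small sketch that copies the question's operator< into a type renamed OriginalKey (the names OriginalKey, equivalent_under_original and count_equivalent_to_first are made up for this illustration). std::map treats two keys as equivalent when neither compares less than the other, so the sketch counts how many sample keys are equivalent to the first one inserted.

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Copy of the question's key type, renamed for this illustration.
struct OriginalKey {
    double m_t;
    double m_f;
    double m_tEpsilon = 3;
    double m_fEpsilon = 0.1;
    OriginalKey(double t, double f) : m_t(t), m_f(f) {}
    bool operator<(const OriginalKey& other) const {
        int s_cmp = (std::abs(other.m_f - m_f) > m_fEpsilon);
        if (s_cmp == 0) {
            return std::abs(other.m_t - m_t) > m_tEpsilon;
        }
        return s_cmp < 0;  // s_cmp is 1 here, so this is always false
    }
};

// std::map considers a and b equivalent when !(a < b) && !(b < a).
bool equivalent_under_original(const OriginalKey& a, const OriginalKey& b) {
    return !(a < b) && !(b < a);
}

// Counts how many of the question's sample keys are equivalent to the first key.
int count_equivalent_to_first() {
    const std::vector<std::pair<double, double>> pairs = {
        {109.9, 9.0}, {110.9, 9.09}, {109.5, 10.0},
        {119.9, 19.0}, {109.9, 10.05}, {111.9, 9.87}};
    const OriginalKey first(pairs[0].first, pairs[0].second);
    int n = 0;
    for (const auto& p : pairs) {
        if (equivalent_under_original(first, OriginalKey(p.first, p.second))) ++n;
    }
    return n;
}
```

count_equivalent_to_first() returns 6: every sample key, including the ones whose m_f is far away, is equivalent to (109.9, 9.0), which is why the map ends up with a single key.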
How to get it working?
#include <iostream>
#include <string>
#include <map>
#include <cmath>
#include <vector>
struct ItemKey
{
double m_t;
double m_f;
static constexpr double t_eps = 3;
static constexpr double f_eps = 0.1;
ItemKey(double t, double f) : m_t(t), m_f(f) {}
bool operator<(const ItemKey& other) const
{
// Here it is assumed that f_eps and t_eps are positive
// We also ignore overflow, underflow, and NaN
// This is written for readability, and assumed the compiler will be
// able to optimize it.
auto fuzzy_less_than = [] (double a, double b, double eps) {
return a < b - eps;
};
bool f_is_less_than = fuzzy_less_than(this->m_f, other.m_f, f_eps);
bool f_is_greater_than = fuzzy_less_than(other.m_f, this->m_f, f_eps);
bool f_is_equal = !f_is_less_than && !f_is_greater_than;
bool t_is_less_than = fuzzy_less_than(this->m_t, other.m_t, t_eps);
return f_is_less_than || (f_is_equal && t_is_less_than);
}
};
int main()
{
using namespace std;
// The pairs are the respective values of m_t and m_f.
vector<pair<double, double>> pairs;
// These two should belong in one bucket
// -> (109.9, 9.0), because m_f differs by 0.09 and m_t differs by just 1
pairs.emplace_back(109.9, 9.0);
pairs.emplace_back(110.9, 9.09);
// This one is separate from the above two because even though m_t is in range,
// m_f is beyond the tolerance level
pairs.emplace_back(109.5, 10.0);
// Same for this one; here both m_t and m_f are beyond the tolerance of either
// of the two categories found above
pairs.emplace_back(119.9, 19.0);
// This one matches the second bucket - (109.5, 10.0)
pairs.emplace_back(109.9, 10.05);
// This one was meant to match it too, but its m_f (9.87) is 0.13 away from 10.0, so it gets its own key.
pairs.emplace_back(111.9, 9.87);
map<ItemKey, size_t> itemMap;
for (const auto& item: pairs)
{
ItemKey key(item.first, item.second);
auto iter = itemMap.find(key);
if (iter == itemMap.end())
{
itemMap[key] = 1;
}
else
{
itemMap[iter->first] = itemMap[iter->first] + 1;
}
}
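// With these tolerances the map ends up with four keys, because
// (111.9, 9.87) is 0.13 away from 10.0 in m_f, which exceeds f_eps:
// - (109.9, 9.0) -> count 2
// - (111.9, 9.87) -> count 1
// - (109.5, 10.0) -> count 2
// - (119.9, 19.0) -> count 1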
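cout << itemMap.size() << endl;
cout << "itemMap contents:" << endl;
for (auto& item : itemMap) {
// ItemKey has no operator<<, so print its members and the count explicitly
cout << " (" << item.first.m_t << ", " << item.first.m_f << ") -> " << item.second << endl;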
}
return 0;
}
There are a few things I changed above. I have a few suggestions also unrelated to the programming mistake:
Do not store boolean values into integer variables.
There's a reason that C++ introduced the bool type.
Write your code to be readable and in a way that the compiler
can easily optimize. You may notice I used a lambda expression
and multiple booleans. Smart compilers will inline the calls to
that lambda expression since it is only used within the local scope.
Also smart compilers can simplify boolean logic and make it
performant for me.
The m_tEpsilon and m_fEpsilon are probably not good to be
changeable variables of the class. In fact, it may be bad if one
object has a different epsilon than another one. If that were the
case, which do you use when you do the < operator? For this
reason, I set them as static constexpr variables in the class.
For constructors, it is better to initialize your variables in the
initializer list rather than in the body of the constructor. The
exception is dynamic resource allocation, which you would want to
do in the constructor body, making sure to clean it up if
you end up throwing an exception (preferably using the RAII
pattern). I'm starting to get too far off topic :)
Class and struct are basically identical except for
the default protection level (class is private by default and
struct is public by default), but it is convention to use a
struct if you want direct access to the member variables. Although,
in this case, I would probably make your class immutable. To do
that, make m_t and m_f private variables and add getters
t() and f(). It might be a bad idea to modify an ItemKey
instance in a map after it has been inserted.
Potential problems with this approach
One of the problems with your approach is that the result depends on the order in which you add elements. Consider the following pairs to be added: (3.0, 10.0) (5.0, 10.0) (7.0, 10.0). If we add them in that order, we will get (3.0, 10.0) (7.0, 10.0), since (5.0, 10.0) was deemed to be equal to (3.0, 10.0). But what if we had inserted (5.0, 10.0) first, then the other two? Then the map would have only one element, (5.0, 10.0), since both of the others would be considered equal to it.
Instead, I would suggest using std::multiset, although of course this will depend on your application. Consider these tests:
void simple_test_map() {
std::map<ItemKey, size_t> counter1;
counter1[{3.0, 10.0}] += 1;
counter1[{5.0, 10.0}] += 1;
counter1[{7.0, 10.0}] += 1;
for (auto &itempair : counter1) {
std::cout << "simple_test_map()::counter1: ("
<< itempair.first.m_t << ", "
<< itempair.first.m_f << ") - "
<< itempair.second << "\n";
}
std::cout << std::endl;
std::map<ItemKey, size_t> counter2;
counter2[{5.0, 10.0}] += 1;
counter2[{3.0, 10.0}] += 1;
counter2[{7.0, 10.0}] += 1;
for (auto &itempair : counter2) {
std::cout << "simple_test_map()::counter2: ("
<< itempair.first.m_t << ", "
<< itempair.first.m_f << ") - "
<< itempair.second << "\n";
}
std::cout << std::endl;
}
This outputs:
simple_test_map()::counter1: (3, 10) - 2
simple_test_map()::counter1: (7, 10) - 1
simple_test_map()::counter2: (5, 10) - 3
And for the multiset variant:
void simple_test_multiset() {
std::multiset<ItemKey> counter1 {{3.0, 10.0}, {5.0, 10.0}, {7.0, 10.0}};
for (auto &item : counter1) {
std::cout << "simple_test_multiset()::counter1: ("
<< item.m_t << ", "
<< item.m_f << ")\n";
}
std::cout << std::endl;
std::multiset<ItemKey> counter2 {{5.0, 10.0}, {3.0, 10.0}, {7.0, 10.0}};
for (auto &item : counter2) {
std::cout << "simple_test_multiset()::counter2: ("
<< item.m_t << ", "
<< item.m_f << ")\n";
}
std::cout << std::endl;
std::cout << "simple_test_multiset()::counter2.size() = "
<< counter2.size() << std::endl;
for (auto &item : counter1) {
std::cout << "simple_test_multiset()::counter2.count({"
<< item.m_t << ", "
<< item.m_f << "}) = "
<< counter1.count(item) << std::endl;
}
std::cout << std::endl;
}
This outputs
simple_test_multiset()::counter1: (3, 10)
simple_test_multiset()::counter1: (5, 10)
simple_test_multiset()::counter1: (7, 10)
simple_test_multiset()::counter2: (5, 10)
simple_test_multiset()::counter2: (3, 10)
simple_test_multiset()::counter2: (7, 10)
simple_test_multiset()::counter2.size() = 3
simple_test_multiset()::counter2.count({3, 10}) = 2
simple_test_multiset()::counter2.count({5, 10}) = 3
simple_test_multiset()::counter2.count({7, 10}) = 2
Note that count() here returns the number of elements within the multiset that are considered equal to the ItemKey passed in. This may be advantageous for situations where you want to ask "how many of my points are within my tolerance of a new point?"
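One caveat worth spelling out: "equal within tolerance" is not transitive (3 matches 5 and 5 matches 7, but 3 does not match 7), so the tolerance-based operator< is not a strict weak ordering, which std::map and std::multiset formally require. The order dependence shown above is one symptom of that. A different technique, not the one used in this answer, is to snap each value to a fixed grid and compare integer cell indices. The sketch below assumes that fixed cells of width t_eps and f_eps are acceptable for the application; BucketKey, make_bucket_key and count_by_bucket are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical alternative key: the cell indices of a fixed grid.
struct BucketKey {
    long long t_cell;
    long long f_cell;
    bool operator<(const BucketKey& other) const {
        // Lexicographic comparison of integers is a strict weak ordering.
        return std::tie(f_cell, t_cell) < std::tie(other.f_cell, other.t_cell);
    }
};

// Maps (t, f) to its grid cell; the default cell widths reuse the tolerances above.
BucketKey make_bucket_key(double t, double f,
                          double t_width = 3.0, double f_width = 0.1) {
    return BucketKey{static_cast<long long>(std::floor(t / t_width)),
                     static_cast<long long>(std::floor(f / f_width))};
}

// Counts points per cell; the result does not depend on insertion order.
std::map<BucketKey, std::size_t> count_by_bucket(
        const std::vector<std::pair<double, double>>& points) {
    std::map<BucketKey, std::size_t> counts;
    for (const auto& p : points) {
        ++counts[make_bucket_key(p.first, p.second)];
    }
    return counts;
}
```

For (3.0, 10.0), (5.0, 10.0), (7.0, 10.0) this gives two buckets whichever order the points arrive in. The trade-off is that two values closer than the tolerance can still land in different cells when they straddle a cell boundary, so this is a change in semantics, not a drop-in fix.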
Good luck!

Related

Using Lambda Function with find_if C++20

I am getting a syntax error on the line
if (auto result = ranges::find_if(height.begi with a red squiggly line under find_if:
no instance of overloaded function matches the argument list
auto is_gt_or_eq = [height, i](int x) { height[x] >= height[i]; };
if (auto result = ranges::find_if(height.begin(), height.end(), is_gt_or_eq); result != height.end()) {
std::cout << "First larger or equal to element in height: " << *result << '\n';
}
else {
std::cout << "No larger or equal to element in height\n";
}
I found this similar code on cppreference, and it does run:
auto is_even = [](int x) { return x % 2 == 0; };
if (auto result = ranges::find_if(height.begin(), height.end(), is_even); result != height.end()) {
std::cout << "First even element in height: " << *result << '\n';
}
else {
std::cout << "No even elements in height\n";
}
I believe the error is in this line of code:
auto is_gt_or_eq = [height, i](int x) { height[x] >= height[i]; };

To start with, you forgot the return keyword for the return statement. So your lambda is returning void by default, thus not a valid predicate. The library won't allow you to call its function due to this mismatch.
Beyond that, x is (a copy of) the element of height, not an index of the container. There is no need to access the container again for the element in the lambda. So, the simplest fix is
auto is_gt_or_eq = [height, i](int x) { return x >= height[i]; };
There's also no need to constantly re-access height[i] in the lambda. It's not a bad idea to just capture that value instead.
auto is_gt_or_eq = [hi = height[i]](int x) { return x >= hi; };
Your lambda is now smaller, more inline-able and (to me at least) more readable.
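For completeness, here is a minimal sketch of the corrected predicate in a self-contained function. The function name first_at_least and the choice to return an index are illustrative only. The sketch uses std::find_if so that it also compiles before C++20; the same lambda can be passed unchanged to ranges::find_if.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the index of the first element >= height[i].
// Precondition: i < height.size().
std::size_t first_at_least(const std::vector<int>& height, std::size_t i) {
    // Capture the threshold by value once; x is an element, not an index.
    auto is_gt_or_eq = [hi = height[i]](int x) { return x >= hi; };
    auto result = std::find_if(height.begin(), height.end(), is_gt_or_eq);
    return static_cast<std::size_t>(result - height.begin());
}
```

With height = {1, 5, 3, 7} and i = 2 (value 3), the first element that is greater than or equal to 3 is the 5 at index 1.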

Returning unique_ptr from std::for_each + lambda functions

I am facing a problem with returning std::moved unique_pointers to a lambda. Once I have moved the pointer to the lambda function how do I take the ownership back from the lambda?
In the following code I am demonstrating my problem. I have taken a snippet out of our code base and moved everything to main to explain the problem better. The first question is marked as "QUESTION 1", where I want to understand whether I am correct in using (*v) for accessing my vector.
The code creates a vector of few numbers, and then iterates over the vector to mark the bits in the bitmap. I think the bits are marked correctly as I am able to print them in the lambda itself. After marking the bits, I want the ownership back. How do I do it? I need to return the bitmap pointer to the caller function.
How do I take the ownership back from a lambda in a STANDARD way, rather than hacking around the passed unique_ptr or avoiding moving the pointer into the lambda? Does C++17 support this?
Compile the code with g++ -std=c++17
#include <memory>
#include <algorithm>
#include <iostream>
#include <vector>
int main () {
int increments = 10;
int numberOfElements = 10;
/* Create the vector */
auto v = std::make_unique<std::vector<int>>(std::vector<int> (numberOfElements));
/* QUESTION 1 - is (*v) the right way to access it or there is a better way. */
std::generate((*v).begin(), (*v).end(), [n=0, increments]() mutable { n = n + increments; return n;});
/* Print the generated elements */
std::cout << "\nPrinting the generated elements ";
std::for_each((*v).begin(), (*v).end(), [](int n) { std::cout<<" " << n;});
/* Int find the maximum element */
int maxElement = *(std::max_element((*v).begin(), (*v).end()));
/* Making a bitmap of the elements */
std::cout << "\nPrinting the maxElement " << maxElement;
auto bitmap = std::make_unique<std::vector <bool>> (std::vector<bool>(maxElement + 1));
/* Now setting all the elements in the vector to true in the bitmap. */
for_each((*v).begin(), (*v).end(), [bmap = std::move(bitmap)](int n) {
(*bmap).at(n) = true;
if ((*bmap).at(n) == true) {
std::cout << "\nBit "<< n <<" marked";
}
});
/*******************************************************
* Question 2 : I now need the ownership of bitmap back.
* How to do it ?. Bitmap is now null after moving it to the lambda.
*/
if (bitmap) {
std::cout << "\nafter reset, ptr is not NULL ";
} else if (bitmap == nullptr) {
std::cout << "\nbitmap is null";
}
}

QUESTION 1 - is (*v) the right way to access it or there is a better way.
An alternative is using ->, so v->begin() instead of (*v).begin().
But it is strange to use unique_ptr for a vector.
You might simply do:
std::vector<int> v(numberOfElements);
std::generate(v.begin(), v.end(),
[n=0, increments]() mutable { n = n + increments; return n;});
std::cout << "\nPrinting the generated elements ";
for (int e : v) { std::cout << " " << e; };
int maxElement = *std::max_element(v.begin(), v.end());
// ...
Question 2 : I now need the ownership of bitmap back. How to do it ?
You cannot do it with a lambda. In your case, capturing by reference does the job (even if the lambda owned the vector, the for_each "block" would not):
// needs #include <cassert> for assert
std::vector<bool> bitmap(maxElement + 1);
// capture the local bitmap by reference; the caller keeps ownership
std::for_each(v.begin(), v.end(), [&bitmap](int n) {
bitmap.at(n) = true;
std::cout << "\nBit "<< n <<" marked";
});
assert(!bitmap.empty());
if (bitmap.empty()) {
std::cout << "\nbitmap is empty";
} else {
std::cout << "\nbitmap is NOT empty";
}
If you replace your lambda with your own functor, you might do something like
class MyMarker
{
public:
MyMarker(std::vector<bool>&& bitmap) : bitmap(std::move(bitmap)) {}
MyMarker(const MyMarker&) = delete;
MyMarker& operator =(const MyMarker&) = delete;
void operator() (int n) // not const: it modifies bitmap
{
bitmap.at(n) = true;
std::cout << "\nBit "<< n <<" marked";
}
std::vector<bool> TakeBack() { return std::move(bitmap); }
private:
std::vector<bool> bitmap;
};
and then:
std::vector<bool> bitmap(maxElement + 1);
MyMarker myMarker(std::move(bitmap));
assert(bitmap.empty());
// MyMarker is not copyable, so pass a reference wrapper (needs #include <functional>)
std::for_each(v.begin(), v.end(), std::ref(myMarker));
bitmap = myMarker.TakeBack();
assert(!bitmap.empty());
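A variation on the functor idea, sketched here under the assumption that the bitmap really must stay in a unique_ptr: std::for_each returns the function object it was given, so a move-only functor can carry the pointer in and hand it back out. BitMarker, take_back and mark_bits are hypothetical names for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical functor that owns the bitmap while the algorithm runs.
class BitMarker {
public:
    explicit BitMarker(std::unique_ptr<std::vector<bool>> bitmap)
        : bitmap_(std::move(bitmap)) {}
    // Move-only because of the unique_ptr member; std::for_each only needs moves.
    void operator()(int n) { bitmap_->at(n) = true; }
    std::unique_ptr<std::vector<bool>> take_back() { return std::move(bitmap_); }
private:
    std::unique_ptr<std::vector<bool>> bitmap_;
};

// Marks every value of v in a fresh bitmap and returns ownership to the caller.
std::unique_ptr<std::vector<bool>> mark_bits(const std::vector<int>& v) {
    const int maxElement = v.empty() ? 0 : *std::max_element(v.begin(), v.end());
    auto bitmap = std::make_unique<std::vector<bool>>(maxElement + 1);
    // for_each hands back the functor it used, so the bitmap comes back with it.
    BitMarker used = std::for_each(v.begin(), v.end(), BitMarker(std::move(bitmap)));
    return used.take_back();
}
```

This does not work with a lambda, because the captured members of a lambda cannot be accessed from outside, which is why a named functor type is used here.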

float value used as a key in multimap

When comparing floats, I think you cannot just use ==; you need to check whether abs(a-b) < epsilon. So when a float value is used as a key, can we use the equal_range function?
For example:
std::multimap<float, string> ds;
ds.insert(make_pair(2.0, string("a")));
ds.insert(make_pair(2.0, string("b")));
ds.insert(make_pair(3.0, string("d")));
ds.equal_range(2.0);

std::multimap::equal_range does not use operator== at all. It uses only the map's comparison object (std::less, i.e. operator<, by default). It returns two iterators, the first being std::multimap::lower_bound (the first element not less than the given key) and the second being std::multimap::upper_bound (the first element greater than the given key).
So it is quite safe to use with floats and doubles.
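The relationship between equal_range and the two bounds can be checked directly. This is a small sketch; FloatMap, equal_range_matches_bounds and entries_in_range are names invented for the illustration.

```cpp
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

using FloatMap = std::multimap<float, std::string>;

// True when equal_range(key) is exactly [lower_bound(key), upper_bound(key)).
bool equal_range_matches_bounds(const FloatMap& ds, float key) {
    const auto range = ds.equal_range(key);
    return range.first == ds.lower_bound(key) && range.second == ds.upper_bound(key);
}

// Number of entries in equal_range(key); no operator== on floats is involved.
std::size_t entries_in_range(const FloatMap& ds, float key) {
    const auto range = ds.equal_range(key);
    return static_cast<std::size_t>(std::distance(range.first, range.second));
}
```

For a key that is not present, such as 2.5f in a map holding 2.0f and 3.0f, both iterators point at the same element and the range is empty.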

Just test it and you'll see it obviously works.
#include <map>
#include <string>
#include <iostream>
int main()
{
std::multimap<float, std::string> ds;
ds.emplace(2.0f, std::string("a"));
ds.emplace(2.0f, std::string("b"));
ds.emplace(3.0f, std::string("d"));
auto r = ds.equal_range(2.0f);
for ( auto it = r.first; it != r.second; ++it )
std::cout << it->second << std::endl;
}
Output:
a
b

You can define your own less-operator for float. See the following example.
// http://www.cplusplus.com/reference/map/multimap/
#include <map>
#include <string>
#include <cmath>
#include <cassert>
#include <iostream>
#include <algorithm>
class CApproxFloatLess {
double m_eps;
public:
CApproxFloatLess(float eps) :
m_eps(eps)
{
assert(eps >= 0);
}
bool operator () (float x, float y) const {
return x + m_eps*(1+std::abs(x)) < y;
}
};
template <class It>
void info(float x, It& it) {
std::cout << "Found pair (" << it->first << ", " << it->second << ") for " << x << ".\n";
}
int main() {
typedef std::multimap<float,std::string,CApproxFloatLess> MyMap;
MyMap ds(CApproxFloatLess(1e-3));
ds.insert(make_pair(2.0, std::string("a")));
ds.insert(make_pair(2.0, std::string("b")));
ds.insert(make_pair(3.0, std::string("d")));
float x=2.001;
MyMap::iterator it=ds.find(x);
if( it != ds.end() )
info(x,it);
x=1.999;
it=ds.find(x);
if( it != ds.end() )
info(x,it);
x=2.01;
it=ds.find(x);
if( it != ds.end() )
info(x,it);
x=3.001;
it=ds.find(x);
if( it != ds.end() )
info(x,it);
return 0;
}
The output of this program is:
Found pair (2, a) for 2.001.
Found pair (2, a) for 1.999.
Found pair (3, d) for 3.001.

Who says you cannot use == to compare two floats? == works perfectly fine for floats; it returns true if they are equal and false if they are different. (There are some oddities with NaNs and negative zeros, but those are not covered by a range check either.)
Obviously if you search for a value that isn't equal using == to any value in the multi map, it won't be found. And if you add two values that are as close to each other as possible, they will both be added.
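The second point can be illustrated with std::nextafter, which yields the closest representable neighbour of a float. The function name entries_equal_to_two is made up for this sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>

// Inserts 2.0f and the next representable float above it, then reports
// how many entries an exact lookup of 2.0f finds.
std::size_t entries_equal_to_two() {
    std::multimap<float, std::string> ds;
    const float two = 2.0f;
    const float just_above = std::nextafter(two, 3.0f);  // closest float > 2.0f
    ds.emplace(two, "a");
    ds.emplace(just_above, "b");
    return ds.count(two);
}
```

The function returns 1: both values are stored as separate keys, and the lookup for 2.0f matches only the exact one.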

C++ class specialization when dealing with STL containers

I'd like a function to return the size in bytes of an object for fundamental types. I'd also like it to return the total size in bytes of an STL container. (I know this is not necessarily the size of the object in memory, and that's okay).
To this end, I've coded a memorysize namespace with a bytes function such that memorysize::bytes(double x) = 8 (on most compilers).
I've specialized it to correctly handle std::vector<double> types, but I don't want to code a different function for each class of the form std::vector<ANYTHING>, so how do I change the template to correctly handle this case?
Here's the working code:
#include <iostream>
#include <vector>
// return the size of bytes of an object (sort of...)
namespace memorysize
{
/// general object
template <class T>
size_t bytes(const T & object)
{
return sizeof(T);
}
/// specialization for a vector of doubles
template <>
size_t bytes<std::vector<double> >(const std::vector<double> & object)
{
return sizeof(std::vector<double>) + object.capacity() * bytes(object[0]);
}
/// specialization for a vector of anything???
}
int main(int argc, char ** argv)
{
// make sure it works for general objects
double x = 1.;
std::cout << "double x\n";
std::cout << "bytes(x) = " << memorysize::bytes(x) << "\n\n";
int y = 1;
std::cout << "int y\n";
std::cout << "bytes(y) = " << memorysize::bytes(y) << "\n\n";
// make sure it works for vectors of doubles
std::vector<double> doubleVec(10, 1.);
std::cout << "std::vector<double> doubleVec(10, 1.)\n";
std::cout << "bytes(doubleVec) = " << memorysize::bytes(doubleVec) << "\n\n";
// would like a new definition to make this work as expected
std::vector<int> intVec(10, 1);
std::cout << "std::vector<int> intVec(10, 1)\n";
std::cout << "bytes(intVec) = " << memorysize::bytes(intVec) << "\n\n";
return 0;
}
How do I change the template specification to allow for the more general std::vector<ANYTHING> case?
Thanks!

Modified your code accordingly:
/// specialization for a vector of anything
template < typename Anything >
size_t bytes(const std::vector< Anything > & object)
{
return sizeof(std::vector< Anything >) + object.capacity() * bytes( object[0] );
}
Note that now you have a problem if invoking bytes with an empty vector.
Edit: Scratch that. If I remember your previous question correctly, then if you get a vector of strings then you would like to take into account the size taken by each string. So instead you should do
/// specialization for a vector of anything
template < typename Anything >
size_t bytes(const std::vector< Anything > & object)
{
size_t result = sizeof(std::vector< Anything >);
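// add the (possibly recursive) size of every stored element
for (const Anything & elem : object)
result += bytes( elem );
// unused capacity holds no constructed elements, so count only raw storage
result += ( object.capacity() - object.size() ) * sizeof( Anything );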
return result;
}
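Put together, the overloads might look like the sketch below. The std::string overload is an assumption about what "size" should mean for strings (object size plus capacity, which over-counts when the small-string optimization applies); it is declared before the vector overload so that the recursive call can find it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace memorysize
{
/// general object: just the static size
template <class T>
std::size_t bytes(const T &)
{
    return sizeof(T);
}

/// overload for std::string (assumed definition: object plus character buffer)
inline std::size_t bytes(const std::string & s)
{
    return sizeof(std::string) + s.capacity();
}

/// overload for a vector of anything; recurses into each element
template <class Anything>
std::size_t bytes(const std::vector<Anything> & object)
{
    std::size_t result = sizeof(std::vector<Anything>);
    for (const Anything & elem : object)
        result += bytes(elem);
    // unused capacity holds no constructed elements, so count only raw storage
    result += (object.capacity() - object.size()) * sizeof(Anything);
    return result;
}
}
```

Because this version never touches object[0], an empty vector is handled without any special case. Nested containers such as std::vector<std::vector<int>> work through the same overload.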

multimap accumulate values

I have a multimap defined by
typedef std::pair<int, int> comp_buf_pair; //pair<comp_t, dij>
typedef std::pair<int, comp_buf_pair> node_buf_pair;
typedef std::multimap<int, comp_buf_pair> buf_map; //key=PE, value = pair<comp_t, dij>
typedef buf_map::iterator It_buf;
int summ (int x, int y) {return x+y;}
int total_buf_size = 0;
std::cout << "\nUpdated buffer values" << std::endl;
for(It_buf it = bufsz_map.begin(); it!= bufsz_map.end(); ++it)
{
comp_buf_pair it1 = it->second;
// max buffer size will be summ(it1.second)
//total_buf_size = std::accumulate(bufsz_map.begin(), bufsz_map.end(), &summ); //error??
std::cout << "Total buffers required for this config = " << total_buf_size << std::endl;
std::cout << it->first << " : " << it1.first << " : " << it1.second << std::endl;
}
I would like to sum all the values pointed by it1.second
How can the std::accumulate function access the second iterator values?

Your issue is with the summ function, you actually need something better than that to be able to handle 2 mismatched types.
If you're lucky, this could work:
int summ(int x, buf_map::value_type const& v) { return x + v.second.second; } // v.second is the pair<comp_t, dij>
If you're unlucky (depending on how accumulate is implemented), you could always:
struct Summer
{
typedef buf_map::value_type const& s_type;
int operator()(int x, s_type v) const { return x + v.second.second; }
int operator()(s_type v, int x) const { return x + v.second.second; }
};
And then use:
int result = std::accumulate(bufsz_map.begin(), bufsz_map.end(), 0, Summer());

I think you'll just need to change your summ function to take the map value_type instead. This is totally untested but it should give the idea.
int summ (int x, const buf_map::value_type& y)
{
return x + y.second.second; // y.second is the pair<comp_t, dij>
}
And call it:
total_buf_size = std::accumulate(bufsz_map.begin(), bufsz_map.end(), 0, &summ);

Why do you mess about with pairs containing pairs? It is too complicated and you'll wind up making errors. Why not define a struct?

Accumulate is a generalization of summation: it computes the sum (or some other binary operation) of init and all of the elements in the range [first, last).
... The result is first initialized to init. Then, for each iterator i in [first, last), in order from beginning to end, it is updated by result = result + *i (in the first version) or result = binary_op(result, *i) (in the second version).
Sgi.com
Your attempt matched neither the first nor the second version: you left out the init argument.
total_buf_size = std::accumulate(bufsz_map.begin(), bufsz_map.end(), 0, &summ);
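Combining the points from the answers above, a self-contained version could look like this sketch. It assumes the value to be summed is the dij member (the second element of the inner pair), as the question's comments suggest; total_buffer_size is a hypothetical helper name.

```cpp
#include <map>
#include <numeric>
#include <utility>

typedef std::pair<int, int> comp_buf_pair;          // pair<comp_t, dij>
typedef std::multimap<int, comp_buf_pair> buf_map;  // key = PE

// Binary operation for std::accumulate: running total plus the dij of one entry.
int summ(int total, const buf_map::value_type& entry)
{
    return total + entry.second.second;
}

// Sums dij over every entry; 0 is the init value the original call left out.
int total_buffer_size(const buf_map& bufsz_map)
{
    return std::accumulate(bufsz_map.begin(), bufsz_map.end(), 0, &summ);
}
```

std::accumulate always calls the operation as op(running_total, element), so only the (int, value_type) parameter order is needed.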
