Boost Intrusive Hashtable - c++

Can anyone provide a simple example of how to use the Boost Intrusive Hashtable? I've tried to implement it, but I'm having little luck.
I have this so far
void HashTableIndex::addToIndex(Message* message)
{
hashtable<MyMessageVector>::bucket_type base_buckets[10000];
hashtable<MyMessageVector> htable(hashtable<MyMessageVector>::bucket_traits(base_buckets, 10000));
boost::array<MyMessageVector,10000> items;
htable.insert_unique(items[0]);
But for some reason it's not calling my hash function, which is defined above like this:
size_t HashTableIndex::hash_value(MyMessageVector& b)
{
boost::hash<string> hasher;
return hasher(b.getKey());
};
For some reason it won't call my hash_value function. Any help on this would be much appreciated!

You can supply the hash function to the hash table using boost::intrusive::hash in the list of options.

You are using a member function and boost::hash requires a free function. See boost::hash documentation:
namespace library
{
std::size_t hash_value(book const& b)
{
boost::hash<int> hasher;
return hasher(b.id);
}
}
You can also use a "friend" function declared in the class as shown in the Boost.Intrusive unordered_set documentation:
class MyClass
{
//...
public:
friend bool operator== (const MyClass &a, const MyClass &b)
{ return a.int_ == b.int_; }
friend std::size_t hash_value(const MyClass &value)
{ return std::size_t(value.int_); }
};
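To see why the member function is never picked up: boost::hash calls hash_value unqualified, so only free (or friend) functions found through argument-dependent lookup are considered. Here is a minimal sketch of that lookup without Boost. The names sketch::adl_hash and app::Message are hypothetical stand-ins, not part of Boost or of the asker's code.

```cpp
#include <cstddef>
#include <functional>
#include <string>

namespace sketch {  // hypothetical stand-in for the library side
// Calls hash_value unqualified, so ADL searches the namespace of T
template <typename T>
struct adl_hash {
    std::size_t operator()(const T& value) const { return hash_value(value); }
};
}  // namespace sketch

namespace app {  // hypothetical stand-in for the user side
struct Message {
    std::string key;
};

// Free function in the same namespace as Message: found by ADL.
// A member function of some other class would never be considered here.
inline std::size_t hash_value(const Message& m) {
    return std::hash<std::string>()(m.key);
}
}  // namespace app
```

Calling sketch::adl_hash<app::Message>()(m) forwards to app::hash_value(m). Moving hash_value into an unrelated class makes the call fail to compile, which mirrors the situation in the question.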

Related

Defining a proxy-based OutputIterator in terms of boost::iterator_facade

I wrote this C++17 code and expected it to work out of the box.
class putc_iterator : public boost::iterator_facade<
putc_iterator,
void,
std::output_iterator_tag
>
{
friend class boost::iterator_core_access;
struct proxy {
void operator= (char ch) { putc(ch, stdout); }
};
auto dereference() const { return proxy{}; }
void increment() {}
bool equal(const putc_iterator&) const { return false; }
};
I'm trying to match the behavior of all the standard OutputIterators by setting my iterator's member typedefs value_type and reference to void (since those types are meaningless for an iterator whose operator* doesn't return a reference).
However, Boost complains:
In file included from prog.cc:2:
/opt/wandbox/boost-1.63.0/clang-head/include/boost/iterator/iterator_facade.hpp:333:50: error: cannot form a reference to 'void'
static result_type apply(Reference const & x)
^
It looks like Boost is trying to hard-code the generated operator*'s signature as reference operator*() const. That is, boost::iterator_facade could deduce the proper return type of operator*() by simply passing along whatever was returned by dereference(); but for some reason it's just not playing along.
What's the solution? I can't pass proxy as a template parameter of the base class since proxy hasn't been defined yet. I could pull proxy out into a detail namespace:
namespace detail {
struct proxy {
void operator= (char ch) { putc(ch, stdout); }
};
}
class putc_iterator : public boost::iterator_facade<
putc_iterator,
void,
std::output_iterator_tag,
detail::proxy
>
{
friend class boost::iterator_core_access;
auto dereference() const { return detail::proxy{}; }
void increment() {}
bool equal(const putc_iterator&) const { return false; }
};
but that seems awkward and is definitely something that "shouldn't be necessary."
Is this a bug in iterator_facade? Is it a feature-not-a-bug? If the latter, then how am I supposed to use it to create OutputIterators?
Also, a minor nitpick: even my workaround with the detail namespace is "wrong" in the sense that it makes std::is_same_v<putc_iterator::reference, detail::proxy> when what I want (for parity with the standard iterators) is std::is_same_v<putc_iterator::reference, void>.
Boost Iterator Facade was good at the time, but it is now outdated because it is not very flexible (it doesn't play well with auto or with r-value references, which in principle can be created by dereferencing an r-value iterator). I am not against the facade concept, but it could be upgraded to C++11.
In addition, with C++11 it is now easier to write an iterator from scratch.
Anyway, if you need to define a reference type just to satisfy the template arguments (and you promise not to use it), you can use void* instead of void. (Or perhaps, for consistency, use proxy& and define proxy outside the class.)
class putc_iterator : public boost::iterator_facade<
putc_iterator,
void*,
std::output_iterator_tag
>
{
friend class boost::iterator_core_access;
struct proxy {
void operator= (char ch) { putc(ch, stdout); }
};
auto dereference() const { return proxy{}; }
void increment() {}
bool equal(const putc_iterator&) const { return false; }
};
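As for writing the iterator from scratch, here is a minimal sketch that follows the same pattern as std::back_insert_iterator. The names append_iterator and copy_through are hypothetical, and the iterator writes into a std::string instead of stdout so the result is easy to check. I use std::ptrdiff_t for difference_type on the assumption that you may want it to work with C++20 as well; the other member typedefs are void, as the question wanted.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical hand-written output iterator: every assigned char is appended to a string
class append_iterator {
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    explicit append_iterator(std::string& out) : out_(&out) {}

    // The assignment performs the write; no separate proxy type is needed
    append_iterator& operator=(char ch) {
        out_->push_back(ch);
        return *this;
    }
    // Dereference and increment are no-ops that hand back the iterator itself
    append_iterator& operator*() { return *this; }
    append_iterator& operator++() { return *this; }
    append_iterator operator++(int) { return *this; }

private:
    std::string* out_;
};

// Copies the input through the iterator and returns what was written
inline std::string copy_through(const std::string& in) {
    std::string out;
    std::copy(in.begin(), in.end(), append_iterator(out));
    return out;
}
```

Because operator* returns the iterator itself, the iterator acts as its own proxy and nothing has to be declared before the class.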

Using lambda instead of a function object, bad performance

My problem is pretty simple: I want to use lambdas the same way I would use a functor as a 'comparator'. Let me explain a little better. I have two big structs, each with its own implementation of operator<, and I also have a useless class (that is just the name of the class in the context of this question) which uses the two structs. Everything looks like this:
struct be_less
{
//A lot of stuff
int val;
be_less(int p_v):val(p_v){}
bool operator<(const be_less& p_other) const
{
return val < p_other.val;
}
};
struct be_more
{
//A lot of stuff
int val;
be_more(int p_v):val(p_v){}
bool operator<(const be_more& p_other) const
{
return val > p_other.val;
}
};
class useless
{
priority_queue<be_less> less_q;
priority_queue<be_more> more_q;
public:
useless(const vector<int>& p_data)
{
for(auto elem:p_data)
{
less_q.emplace(elem);
more_q.emplace(elem);
}
}
};
I would like to remove the duplication in the two structs. The simplest idea is to make the struct a template and provide two functors to do the comparison job:
template<typename Comp>
struct be_all
{
//Lot of stuff, better do not duplicate
int val;
be_all(int p_v):val{p_v}{}
bool operator<(const be_all<Comp>& p_other) const
{
return Comp()(val,p_other.val);
}
};
class comp_less
{
public:
bool operator()(int p_first,
int p_second)
{
return p_first < p_second;
}
};
class comp_more
{
public:
bool operator()(int p_first,
int p_second)
{
return p_first > p_second;
}
};
typedef be_all<comp_less> all_less;
typedef be_all<comp_more> all_more;
class useless
{
priority_queue<all_less> less_q;
priority_queue<all_more> more_q;
public:
useless(const vector<int>& p_data)
{
for(auto elem:p_data)
{
less_q.emplace(elem);
more_q.emplace(elem);
}
}
};
This works pretty well; now I certainly don't have any duplication in the struct code, at the price of two additional function objects. Please note that I'm greatly simplifying the implementation of operator<; the hypothetical real code does much more than just compare two ints.
Then I was thinking about how to do the same thing using lambdas (just as an experiment). The only working solution I was able to implement is:
template<typename Comp>
struct be_all
{
int val;
function<bool(int,int)> Comparator;
be_all(Comp p_comp,int p_v):
Comparator(move(p_comp)),
val{p_v}
{}
bool operator<(const be_all& p_other) const
{
return Comparator(val, p_other.val);
}
};
auto be_less = [](int p_first,
int p_second)
{
return p_first < p_second;
};
auto be_more = [](int p_first,
int p_second)
{
return p_first > p_second;
};
typedef be_all<decltype(be_less)> all_less;
typedef be_all<decltype(be_more)> all_more;
class useless
{
priority_queue<all_less> less_q;
priority_queue<all_more> more_q;
public:
useless(const vector<int>& p_data)
{
for(auto elem:p_data)
{
less_q.emplace(be_less,elem);
more_q.emplace(be_more,elem);
}
}
};
This implementation not only adds a new member to the data-containing struct, but also has very poor performance. I prepared a small test in which I create one instance of each of the useless classes I've shown you here, each time feeding the constructor a vector of 2 million integers. The results are the following:
Takes 48ms to execute the constructor of the first useless class
Takes 228ms to create the second useless class (functor)
Takes 557ms to create the third useless class (lambdas)
Clearly the price I pay for removing the duplication is very high, and in the original code the duplication is still there. Please note how bad the performance of the third implementation is: about ten times slower than the original one. I believed that the third implementation was slower than the second because of the additional parameter in the constructor of be_all... but:
Actually there's also a fourth case, where I still used the lambda but got rid of the Comparator member and of the additional parameter in be_all. The code is the following:
template<typename Comp>
struct be_all
{
int val;
be_all(int p_v):val{p_v}
{}
bool operator<(const be_all& p_other) const
{
return Comp(val, p_other.val);
}
};
bool be_less = [](int p_first,
int p_second)
{
return p_first < p_second;
};
bool be_more = [](int p_first,
int p_second)
{
return p_first > p_second;
};
typedef be_all<decltype(be_less)> all_less;
typedef be_all<decltype(be_more)> all_more;
class useless
{
priority_queue<all_less> less_q;
priority_queue<all_more> more_q;
public:
useless(const vector<int>& p_data)
{
for(auto elem:p_data)
{
less_q.emplace(elem);
more_q.emplace(elem);
}
}
};
If I remove auto from the lambda and use bool instead, the code builds even if I use Comp(val, p_other.val) in operator<.
What's very strange to me is that this fourth implementation (lambda without the Comparator member) is even slower than the others. In the end, the average timings I was able to record are the following:
48ms
228ms
557ms
698ms
Why are the functors so much faster than the lambdas in this scenario? I was expecting lambdas to perform at least as well as an ordinary functor; can any of you comment, please? And is there any technical reason why the fourth implementation is slower than the third?
PS:
The compiler I'm using is g++ 4.8.2 with -O3. In my test I create an instance of each useless class and use chrono to measure the required time:
namespace benchmark
{
template<typename T>
long run()
{
auto start=chrono::high_resolution_clock::now();
T t(data::plenty_of_data);
auto stop=chrono::high_resolution_clock::now();
return chrono::duration_cast<chrono::milliseconds>(stop-start).count();
}
}
and:
cout<<"Bad code: "<<benchmark::run<bad_code::useless>()<<"ms\n";
cout<<"Bad code2: "<<benchmark::run<bad_code2::useless>()<<"ms\n";
cout<<"Bad code3: "<<benchmark::run<bad_code3::useless>()<<"ms\n";
cout<<"Bad code4: "<<benchmark::run<bad_code4::useless>()<<"ms\n";
The set of input integers is the same for all; plenty_of_data is a vector of 2 million integers.
Thanks for your time
You are not comparing the runtime of a lambda and a functor. Instead, the numbers show the difference between using a functor and using a std::function. A std::function<R(Args...)> can store any Callable satisfying the signature R(Args...), and it does this through type erasure. So the difference you see comes from the overhead of a virtual call in std::function::operator().
For example, the libc++ implementation(3.5) has a base class template<class _Fp, class _Alloc, class _Rp, class ..._ArgTypes> __base with a virtual operator(). std::function stores a __base<...>*. Whenever you create an std::function with a callable F, an object of type template<class F, class _Alloc, class R, class ...Args> class __func is created, which inherits from __base<...> and overrides the virtual operator().
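To keep the lambda's own type (and so avoid the type erasure), you can pass decltype of the lambda as the comparator type and hand the lambda object to the container. Below is a rough sketch contrasting the two; the function names smallest_via_lambda and smallest_via_function are made up for illustration, and both assume a non-empty input. I have not measured these, so treat it as an illustration of the types involved rather than a benchmark.

```cpp
#include <functional>
#include <queue>
#include <vector>

// The comparator type is the closure type itself, so calls can be inlined
inline int smallest_via_lambda(const std::vector<int>& data) {
    auto more = [](int a, int b) { return a > b; };
    std::priority_queue<int, std::vector<int>, decltype(more)> q(more);
    for (int v : data) q.push(v);
    return q.top();  // "greater" ordering makes the smallest element the top
}

// The comparator type is std::function, so every comparison goes through type erasure
inline int smallest_via_function(const std::vector<int>& data) {
    std::function<bool(int, int)> more = [](int a, int b) { return a > b; };
    std::priority_queue<int, std::vector<int>, std::function<bool(int, int)>> q(more);
    for (int v : data) q.push(v);
    return q.top();
}
```

Both functions return the same value; the difference is only in how the comparison is dispatched.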

My class has a toString() method, how do I use this for hashing in std::unordered_set?

MyClass defines operator== and has a non-trivial internal state, but it does provide a wstring toString() method, which returns a serialized version of that state. So I thought it would be easy just to use toString() with hash<wstring> on std::unordered_set.
But is it possible to do this in a nice neat way without defining extraneous functor classes? I'm only just getting to grips with C++11 after moving to VS2013 and I thought this was one of the big steps forward, being able to define such things as lambdas?
Thanks for any suggestions how best to do this.
auto hasher = [](const MyClass &m){ return std::hash<std::wstring>()(m.toString()); };
std::unordered_set<MyClass, decltype(hasher)> set(10, hasher);
Unfortunately this doesn't currently work with MSVC due to a bug.
Possible workarounds include writing a specialization of std::hash for MyClass, or storing the lambda in a std::function<std::size_t(const MyClass &)> and using that as the hasher's type:
std::function<std::size_t(const MyClass &)> hasher =
[](const MyClass &m) { return std::hash<std::wstring>()(m.toString()); };
std::unordered_set<MyClass, std::function<std::size_t(const MyClass &)>> set(10, hasher);
The best approach is to tell std::hash how to hash MyClass via specialization:
namespace std {
template <>
struct hash<MyClass> {
std::size_t operator () (const MyClass& mc) const {
return std::hash<std::wstring>()(mc.toString());
}
};
} // namespace std
so you don't need to bother with non-default template parameters for unordered_set or unordered_map.
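For completeness, here is a small self-contained sketch of the specialization in use. The MyClass below is a hypothetical stand-in (an int wrapped in a class), since the real class from the question isn't shown, and count_distinct is just a helper for the demonstration.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the asker's class
class MyClass {
public:
    explicit MyClass(int id) : id_(id) {}
    std::wstring toString() const { return L"MyClass#" + std::to_wstring(id_); }
    friend bool operator==(const MyClass& a, const MyClass& b) { return a.id_ == b.id_; }

private:
    int id_;
};

namespace std {
template <>
struct hash<MyClass> {
    std::size_t operator()(const MyClass& mc) const {
        return std::hash<std::wstring>()(mc.toString());
    }
};
}  // namespace std

// Inserts a duplicate and reports how many distinct elements remain
inline std::size_t count_distinct() {
    std::unordered_set<MyClass> set;  // no extra template arguments needed
    set.insert(MyClass(1));
    set.insert(MyClass(2));
    set.insert(MyClass(1));  // equal to the first element, so it is not stored again
    return set.size();
}
```

Note that toString() is called on every hash computation, so this is only cheap if serialization is cheap.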

Is it possible to set default constructor to `std::map<T1, T2>` values?

So I want to create a simple map std::map<T1, std::string>, and I have a function that returns std::string. I want to somehow link item creation in std::map with my function, so that when my_map[some_new_element] is called, my function will be called and its return value set as the value for the some_new_element key. Is such a thing possible, and how do I do it?
You can wrap the map itself, the value type, or operator[].
The last wrapper will be the simplest:
template <typename T>
std::string& get_default(std::map<T, std::string>& map, const T& key) {
auto it = map.find(key);
if (it == map.end()) {
return map[key] = create_default_value();
} else {
return it->second;
}
}
The value type shouldn't be too hard, either:
struct default_string {
std::string wrapped_string;
default_string() : wrapped_string(create_default_value()) {}
explicit default_string(const std::string& wrapped_string)
: wrapped_string(wrapped_string) {}
operator const std::string&() const { return wrapped_string; }
operator std::string&() { return wrapped_string; }
};
Wrapping map will take a bit more work, as you'd have to duplicate the entire interface, including typedefs. Note: this code is not tested, treat it as proof-of-concept, to steer you in the right direction.
What about a small wrapper class for std::string?
class StringWrapper {
public:
StringWrapper() { //... your code
}
operator std::string&() { return m_string; } // or something like that
private:
std::string m_string;
};
Now you use the following map-type:
std::map<T1, StringWrapper> mymap;
In the constructor of StringWrapper you can define custom actions. It gets called when you insert an item into your map.
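A minimal runnable sketch of the value-wrapper idea follows. Here create_default_value is a hypothetical stand-in for the asker's function (it just returns a fixed string), and lookup is a helper invented for the demonstration.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the function that produces the default value
inline std::string create_default_value() { return "default"; }

// Wrapper whose default constructor calls the function above
struct default_string {
    std::string value;
    default_string() : value(create_default_value()) {}
};

// operator[] default-constructs the wrapper for a missing key,
// which is what triggers create_default_value()
inline std::string lookup(std::map<int, default_string>& m, int key) {
    return m[key].value;
}
```

After calling lookup on an empty map, the map contains one entry for that key, holding whatever create_default_value returned.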

Extension methods in c++

I was searching for an implementation of extension methods in C++ and came upon this comp.std.c++ discussion, which mentions that polymorphic_map can be used to associate methods with a class, but the provided link seems to be dead. Does anyone know what that answer was referring to, or if there is another way to extend classes in a similar manner to extension methods (perhaps through some usage of mixins)?
I know the canonical C++ solution is to use free functions; this is more out of curiosity than anything else.
Different languages approach development in different ways. In particular, C# and Java have a strong point of view with respect to OO that leads to an "everything is an object" mindset (C# is a little more lax here). In that approach, extension methods provide a simple way of extending an existing object or interface to add new features.
There are no extension methods in C++, nor are they needed. When developing in C++, forget the "everything is an object" paradigm, which, by the way, is false even in Java/C# [*]. C++ takes a different mindset: there are objects, and the objects have operations that are inherently part of the object, but there are also other operations that form part of the interface and need not be part of the class. A must-read by Herb Sutter is What's In a Class?, where the author argues (and I agree) that you can easily extend any given class with simple free functions.
As a particular simple example, the standard templated class basic_ostream has a few member methods to dump the contents of some primitive types, and then it is enhanced with (also templated) free functions that extend that functionality to other types by using the existing public interface. For example, std::cout << 1; is implemented as a member function, while std::cout << "Hi"; is a free function implemented in terms of other more basic members.
Extensibility in C++ is achieved by means of free functions, not by ways of adding new methods to existing objects.
[*] Everything is not an object.
A given domain will contain a set of actual objects that can be modeled and operations that can be applied to them. In some cases those operations will be part of the object, but in other cases they will not. In particular, you will find utility classes in the languages that claim that everything is an object, and those utility classes are nothing but a layer trying to hide the fact that those methods don't belong to any particular object.
Even some operations that are implemented as member functions are not really operations on the object. Consider addition for a Complex number class, how is sum (or +) more of an operation on the first argument than the second? Why a.sum(b); or b.sum(a), should it not be sum( a, b )?
Forcing the operations to be member methods actually produces weird effects --but we are just used to them: a.equals(b); and b.equals(a); might have completely different results even if the implementation of equals is fully symmetric. (Consider what happens when either a or b is a null pointer)
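As a concrete illustration of extending a class with a free function, here is a sketch of a trim for std::string that uses only the string's public interface. The namespace text_utils and the choice of whitespace characters are my own assumptions for the example.

```cpp
#include <cstddef>
#include <string>

namespace text_utils {  // hypothetical namespace for this sketch

// Returns a copy of s without leading and trailing whitespace
inline std::string trim(const std::string& s) {
    const char* whitespace = " \t\n\r";
    const std::size_t first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return std::string();  // the string is empty or all whitespace
    }
    const std::size_t last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}

}  // namespace text_utils
```

It is called as text_utils::trim(s) rather than s.trim(), but it needs no access to the internals of std::string and no change to the class.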
The Boost Range library's approach uses operator|().
r | filtered(p);
I can write trim for string in the same way, too.
#include <string>
namespace string_extension {
struct trim_t {
std::string operator()(const std::string& s) const
{
...
return s;
}
};
const trim_t trim = {};
std::string operator|(const std::string& s, trim_t f)
{
return f(s);
}
} // namespace string_extension
int main()
{
const std::string s = " abc ";
const std::string result = s | string_extension::trim;
}
This is the closest thing to extension methods that I have ever seen in C++. Personally, I like the way it can be used, and possibly this is the closest we can get to extension methods in this language. But there are some disadvantages:
It may be complicated to implement
Operator precedence may not be that nice sometimes, which may cause surprises
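To illustrate the precedence point with a small sketch: == binds tighter than |, so a piped expression has to be parenthesized before it is compared. The namespace ext, the upper adaptor and the helper is_abc_upper are all hypothetical names for this example.

```cpp
#include <cctype>
#include <string>

namespace ext {  // hypothetical namespace for this sketch

struct upper_t {};
const upper_t upper = {};

// "Extension method" that returns an upper-cased copy of the string
inline std::string operator|(std::string s, upper_t) {
    for (char& c : s) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return s;
}

}  // namespace ext

inline bool is_abc_upper(const std::string& s) {
    // The parentheses are required: without them the expression parses as
    // s | (ext::upper == "ABC"), which does not compile.
    return (s | ext::upper) == "ABC";
}
```

The same issue appears with arithmetic, shift and relational operators, which all bind tighter than |.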
A solution:
#include <iostream>
using namespace std;
class regular_class {
public:
void simple_method(void) const {
cout << "simple_method called." << endl;
}
};
class ext_method {
private:
// arguments of the extension method
int x_;
public:
// arguments get initialized here
ext_method(int x) : x_(x) {
}
// just a dummy overload to return a reference to itself
ext_method& operator-(void) {
return *this;
}
// extension method body is implemented here. The return type of this op. overload
// should be the return type of the extension method
friend const regular_class& operator<(const regular_class& obj, const ext_method& mthd) {
cout << "Extension method called with: " << mthd.x_ << " on " << &obj << endl;
return obj;
}
};
int main()
{
regular_class obj;
cout << "regular_class object at: " << &obj << endl;
obj.simple_method();
obj<-ext_method(3)<-ext_method(8);
return 0;
}
This is not my personal invention, recently a friend of mine mailed it to me, he said he got it from a university mailing list.
The short answer is that you cannot do that. The long answer is that you can simulate it, but be aware that you'll have to write a lot of code as a workaround (actually, I don't think there is an elegant solution).
In the discussion, a very complex workaround is provided using operator- (which is a bad idea, in my opinion). I guess that the solution provided in the dead link was more or less similar (since it was based on operator|).
This relies on being able to do more or less the same thing as an extension method with operators. For example, if you want to overload the ostream's operator<< for your new class Foo, you could do:
class Foo {
friend ostream &operator<<(ostream &o, const Foo &foo);
// more things...
};
ostream &operator<<(ostream &o, const Foo &foo)
{
// write foo's info to o
}
As I said, this is the only similar mechanism available in C++ for extension methods. If you can naturally translate your function to an overloaded operator, then it is fine. The only other possibility is to artificially overload an operator that has nothing to do with your objective, but this is going to make you write very confusing code.
The most similar approach I can think of would be to create an extension class and put your new methods there. Unfortunately, this means that you'll need to "adapt" your objects:
class stringext {
public:
stringext(std::string &s) : str( &s )
{}
string trim()
{ ...; return *str; }
private:
string * str;
};
And then, when you want to use it:
void fie(string &str)
{
// ...
cout << stringext( str ).trim() << endl;
}
As said, this is not perfect, and I don't think that kind of perfect solution exists.
Sorry.
To elaborate on @Akira's answer, operator| can also be used to extend existing classes with functions that take parameters. Here is an example that I'm using to extend the Xerces XML library with find functionality that can be easily chained:
#pragma once
#include <string>
#include <stdexcept>
#include <xercesc/dom/DOMElement.hpp>
#define _U16C // macro that converts string to char16_t array
XERCES_CPP_NAMESPACE_BEGIN
struct FindFirst
{
FindFirst(const std::string& name);
DOMElement * operator()(const DOMElement &el) const;
DOMElement * operator()(const DOMElement *el) const;
private:
std::string m_name;
};
struct FindFirstExisting
{
FindFirstExisting(const std::string& name);
DOMElement & operator()(const DOMElement &el) const;
private:
std::string m_name;
};
inline DOMElement & operator|(const DOMElement &el, const FindFirstExisting &f)
{
return f(el);
}
inline DOMElement * operator|(const DOMElement &el, const FindFirst &f)
{
return f(el);
}
inline DOMElement * operator|(const DOMElement *el, const FindFirst &f)
{
return f(el);
}
inline FindFirst::FindFirst(const std::string & name)
: m_name(name)
{
}
inline DOMElement * FindFirst::operator()(const DOMElement &el) const
{
auto list = el.getElementsByTagName(_U16C(m_name));
if (list->getLength() == 0)
return nullptr;
return static_cast<DOMElement *>(list->item(0));
}
inline DOMElement * FindFirst::operator()(const DOMElement *el) const
{
if (el == nullptr)
return nullptr;
auto list = el->getElementsByTagName(_U16C(m_name));
if (list->getLength() == 0)
return nullptr;
return static_cast<DOMElement *>(list->item(0));
}
inline FindFirstExisting::FindFirstExisting(const std::string & name)
: m_name(name)
{
}
inline DOMElement & FindFirstExisting::operator()(const DOMElement & el) const
{
auto list = el.getElementsByTagName(_U16C(m_name));
if (list->getLength() == 0)
throw std::runtime_error(std::string("Missing element with name ") + m_name);
return static_cast<DOMElement &>(*list->item(0));
}
XERCES_CPP_NAMESPACE_END
It can be used this way:
auto packetRate = *elementRoot | FindFirst("Header") | FindFirst("PacketRate");
auto &decrypted = *elementRoot | FindFirstExisting("Header") | FindFirstExisting("Decrypted");
You can enable something like extension methods for your own class/struct or for some specific type in some scope. See the rough solution below.
class Extensible
{
public:
template<class TRes, class T, class... Args>
std::function<TRes(Args...)> operator|
(std::function<TRes(T&, Args...)>& extension)
{
return [this, &extension](Args... args) -> TRes
{
return extension(*static_cast<T*>(this), std::forward<Args>(args)...);
};
}
};
Then inherit your class from this and use it like this:
class SomeExtensible : public Extensible { /*...*/ };
std::function<int(SomeExtensible&, int)> fn;
SomeExtensible se;
int i = (se | fn)(4);
Or you can declare this operator in cpp file or namespace.
//for std::string, for example
template<class TRes, class... Args>
std::function<TRes(Args...)> operator|
(std::string& s, std::function<TRes(std::string&, Args...)>& extension)
{
return [&s, &extension](Args... args) -> TRes
{
return extension(s, std::forward<Args>(args)...);
};
}
std::string s = "newStr";
std::function<std::string(std::string&)> init = [](std::string& s) {
return s = "initialized";
};
(s | init)();
Or even wrap it in a macro (I know, it's generally a bad idea, but you can nevertheless):
#define ENABLE_EXTENSIONS_FOR(x) \
template<class TRes, class... Args> \
std::function<TRes(Args...)> operator| (x s, std::function<TRes(x, Args...)>& extension) \
{ \
return [&s, &extension](Args... args) -> TRes \
{ \
return extension(s, std::forward<Args>(args)...); \
}; \
}
ENABLE_EXTENSIONS_FOR(std::vector<int>&);
This syntactic sugar isn't available in C++, but you can define your own namespace and write purely static classes, using const references as the first parameter.
For example, I was struggling with the STL for some array operations, and I didn't like the syntax; I was used to the functional way JavaScript's array methods work.
So I made my own namespace wh with the class vector in it, since that's the class I was expecting to use these methods with, and this is the result:
//#ifndef __WH_HPP
//#define __WH_HPP
#include <vector>
#include <functional>
#include <algorithm>
namespace wh{
template<typename T>
class vector{
public:
static T reduce(const std::vector<T> &array, const T &accumulatorInitiator, const std::function<T(T,T)> &functor){
T accumulator = accumulatorInitiator;
for(auto &element: array) accumulator = functor(element, accumulator);
return accumulator;
}
static T reduce(const std::vector<T> &array, const T &accumulatorInitiator){
return wh::vector<T>::reduce(array, accumulatorInitiator, [](T element, T acc){return element + acc;});
}
static std::vector<T> map(const std::vector<T> &array, const std::function<T(T)> &functor){
std::vector<T> ret;
transform(array.begin(), array.end(), std::back_inserter(ret), functor);
return ret;
}
static std::vector<T> filter(const std::vector<T> &array, const std::function<bool(T)> &functor){
std::vector<T> ret;
copy_if(array.begin(), array.end(), std::back_inserter(ret), functor);
return ret;
}
static bool all(const std::vector<T> &array, const std::function<bool(T)> &functor){
return all_of(array.begin(), array.end(), functor);
}
static bool any(const std::vector<T> &array, const std::function<bool(T)> &functor){
return any_of(array.begin(), array.end(), functor);
}
};
}
//#undef __WH_HPP
I wouldn't inherit from it or compose a class with it, since I've never been able to do that cleanly without side effects, but I came up with this: just const references.
The problem, of course, is the extremely verbose code you have to write in order to use these static methods:
int main()
{
vector<int> numbers = {1,2,3,4,5,6};
numbers = wh::vector<int>::filter(numbers, [](int number){return number < 3;});
numbers = wh::vector<int>::map(numbers,[](int number){return number + 3;});
for(const auto& number: numbers) cout << number << endl;
return 0;
}
If only there was syntactic sugar that could make my static methods have some kind of more common syntax like:
myvector.map([](int number){return number+2;}); //...
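One way to get closer to that chained syntax, borrowing the operator| idea from the other answers, is sketched below. Everything in the pipes namespace (filter, map, and the two operator| overloads) is a hypothetical design for illustration, not an existing library, and small_plus_three just reproduces the main() above in piped form.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

namespace pipes {  // hypothetical namespace for this sketch

// Small tag types that carry the callable until operator| applies it
template <typename F> struct filter_t { F pred; };
template <typename F> struct map_t { F fn; };

template <typename F> filter_t<F> filter(F pred) { return {pred}; }
template <typename F> map_t<F> map(F fn) { return {fn}; }

// Keeps only the elements for which the predicate returns true
template <typename T, typename F>
std::vector<T> operator|(const std::vector<T>& in, const filter_t<F>& f) {
    std::vector<T> out;
    std::copy_if(in.begin(), in.end(), std::back_inserter(out), f.pred);
    return out;
}

// Applies the function to every element (element type is kept, as in wh::vector::map)
template <typename T, typename F>
std::vector<T> operator|(const std::vector<T>& in, const map_t<F>& m) {
    std::vector<T> out;
    std::transform(in.begin(), in.end(), std::back_inserter(out), m.fn);
    return out;
}

}  // namespace pipes

// Same pipeline as the main() above: keep numbers below 3, then add 3 to each
inline std::vector<int> small_plus_three(const std::vector<int>& numbers) {
    using namespace pipes;
    return numbers
        | filter([](int n) { return n < 3; })
        | map([](int n) { return n + 3; });
}
```

The callables are stored by their own type rather than in std::function, so there is no type erasure involved, at the cost of one temporary vector per pipeline stage.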