Boost::multi_index with map - c++

I have a question about modifying elements in a boost::multi_index container.
I have a structure containing some pre-defined parameters and
a number of parameters that are defined at run-time and stored in a map.
Here is a simplified version of the structure:
class Sdata{
    QMap<ParamName, Param> params; // parameters defined at run-time
public:
    int num;
    QString key;
    // more pre-defined parameters
    // methods to modify the map
    // as an example - mock version of a function to add the parameter
    // there are more functions operating on the QMap<...>, which follow the same
    // rule - return true if they operated successfully, false otherwise.
    bool add_param(ParamName name, Param value){
        if (params.contains(name)) return false;
        params.insert(name, value);
        return true;
    }
};
Now, I want to iterate over different combinations of the pre-defined parameters
of Sdata. To do this, I went for boost::multi_index:
typedef multi_index_container<Sdata,
    indexed_by <
        // by insertion order
        random_access<>,
        // by key
        hashed_unique<
            tag<sdata_tags::byKey>,
            const_mem_fun<Sdata, SdataKey, &Sdata::get_key>
        >,
        // by TS
        ordered_non_unique<
            tag<sdata_tags::byTS>,
            const_mem_fun<Sdata, TS, &Sdata::get_ts>
        >,
        // more keys and composite-keys
    > // end indexed_by
> SdataDB;
And now, I want to access and modify the parameters inside the QMap<...>.
Q1 Do I get it correctly that to modify any field (even those unrelated to
the index), one needs to use functors and do something as below?
Sdatas_byKey const &l = sdatas.get<sdata_tags::byKey>();
auto it = l.find(key);
l.modify(it, Functor(...));
Q2 How to get the result of the method using the functor? I.e., I have a functor:
struct SdataRemoveParam : public std::unary_function<Sdata, void>{
    ParamName name;
    SdataRemoveParam(ParamName h): name(h){}
    void operator ()(Sdata &sdata){
        sdata.remove_param (name); // this returns false if there is no param
    }
};
How can I find out whether remove_param returned true or false in this example?
Sdatas_byKey const &l = sdatas.get<sdata_tags::byKey>();
auto it = l.find(key);
l.modify(it, SdataRemoveParam("myname"));
What I've arrived at so far is throwing an exception, so that the modify
method of boost::multi_index, when used with a Rollback functor, returns
false:
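struct SdataRemoveParam : public std::unary_function<Sdata, void>{
    ParamName name;
    SdataRemoveParam(ParamName h): name(h){}
    void operator ()(Sdata &sdata){
        // std::runtime_error (from <stdexcept>) is used here because the standard
        // std::exception has no constructor that takes a message
        if (!sdata.remove_param (name)) throw std::runtime_error("Remove failed");
    }
};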
// in some other place
Sdatas_byKey const &l = sdatas.get<sdata_tags::byKey>();
auto it = l.find(key);
bool res = l.modify(it, SdataRemoveParam("myname"), Rollback);
However, I do not like this solution, because it increases the risk of deleting
the entry from the container.
Q3 Are there any better solutions?

Q1 Do I get it correctly that to modify any field (even those
unrelated to the index), one needs to use functors and do something as
below?
The short answer is yes: use modify for safety. If you're absolutely sure that the data you modify does not belong to any index, then you can get by with an ugly cast:
const_cast<Sdata&>(*it).remove_param("myname");
but this is strongly discouraged. With C++11 (which you seem to be using), you can use lambdas rather than cumbersome user-defined functors:
Sdatas_byKey &l = sdatas.get<sdata_tags::byKey>(); // note, this can't be const
auto it = l.find(key);
l.modify(it, [](Sdata& s){
    s.remove_param("myname");
});
Q2 How to get the result of the method using the functor?
Again, with lambdas this is very simple:
bool res;
l.modify(it, [&](Sdata& s){
    res=s.remove_param("myname");
});
With functors you can do the same but it requires more boilerplate (basically, have SdataRemoveParam store a pointer to res).
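The functor variant mentioned above could look roughly like the following sketch. It uses only the standard library so that it stands on its own: SdataSketch (a std::map in place of the QMap), RemoveParam, apply_modifier and remove_and_report are all hypothetical names, and apply_modifier is merely a stand-in for index.modify(it, mod). Like modify, it takes the modifier by value, which is why the result has to travel through a pointer rather than a member of the functor.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for Sdata: a std::map replaces the QMap.
struct SdataSketch {
    std::map<std::string, int> params;
    bool remove_param(const std::string& name) {
        return params.erase(name) > 0;  // true only if an entry was removed
    }
};

// Functor that reports the inner result through a caller-supplied pointer.
struct RemoveParam {
    std::string name;
    bool* result;  // not owned; must outlive the call
    RemoveParam(std::string n, bool* r) : name(std::move(n)), result(r) {
        assert(result != nullptr);
    }
    void operator()(SdataSketch& s) const { *result = s.remove_param(name); }
};

// Stand-in for index.modify(it, mod): the modifier is received by value,
// so anything stored inside the functor itself would be lost to the caller.
template <typename Modifier>
void apply_modifier(SdataSketch& s, Modifier mod) {
    mod(s);
}

bool remove_and_report(SdataSketch& s, const std::string& name) {
    bool res = false;
    apply_modifier(s, RemoveParam(name, &res));
    return res;
}
```

With a real container, the call inside remove_and_report would presumably become l.modify(it, RemoveParam(name, &res)).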
The following is just for fun: if you're using C++14 you can encapsulate the whole idiom very tersely like this (C++11 would be slightly harder):
template<typename Index,typename Iterator,typename F>
auto modify_inner_result(Index& i,Iterator it,F f)
{
    decltype(f(std::declval<typename Index::value_type&>())) res;
    i.modify(it,[&](auto& x){res=f(x);});
    return res;
}
...
bool res=modify_inner_result(l,it, [&](Sdata& s){
    return s.remove_param("myname");
});

Related

How can I return unordered_map with a hash function to something expecting just `unordered_map<string, string>`

I want to implement an unordered_map<string, string> that ignores case in the keys. My code looks like:
std::unordered_map<std::string, std::string> noCaseMap()
{
    struct hasher {
        std::size_t operator()(const std::string& key) const {
            return std::hash<std::string>{}(toLower(key));
        }
    };
    std::unordered_map<std::string, std::string, hasher> ret;
    return ret;
}
but Xcode flags the return statement with this error:
foo.cpp:181:20 No viable conversion from returned value of type 'unordered_map<[2 * ...], hasher>' to function return type 'unordered_map<[2 * ...], (default) std::hash<std::string>>'
I tried casting ret to <std::unordered_map<std::string, std::string>>, but Xcode wasn't having it.
I tried making my hasher a subclass of std::hash<std::string>, but that made no difference.
Edit: this is a slight oversimplification of the problem; I know I also have to implement a case-insensitive equal_to() functor as well.
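You can't. The hasher is a template parameter of std::unordered_map, so a map with a custom hasher is a different, unrelated type, and no cast will convert one into the other. There's a reason it's part of the type: efficiency. What you can do is, e.g., store everything lowercase. If you need both lowercase and case-preserving keys, you might need two maps; but at this point I'd consider requesting an interface change.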
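A minimal sketch of the "store everything lowercase" idea, assuming ASCII keys only. The helper names toLowerAscii, putNoCase and findNoCase are hypothetical and not part of any library; the point is that the map keeps the plain std::unordered_map<std::string, std::string> type the caller expects.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <unordered_map>
#include <utility>

using PlainMap = std::unordered_map<std::string, std::string>;

// Hypothetical helper: lowercases ASCII letters only (no locale or Unicode handling).
std::string toLowerAscii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Every key is normalized on the way in...
void putNoCase(PlainMap& m, const std::string& key, std::string value) {
    m[toLowerAscii(key)] = std::move(value);
}

// ...and on the way out, so lookups ignore case.
// Returns nullptr when the key is absent.
const std::string* findNoCase(const PlainMap& m, const std::string& key) {
    auto it = m.find(toLowerAscii(key));
    return it == m.end() ? nullptr : &it->second;
}

void noCaseUsageExample() {
    PlainMap m;
    putNoCase(m, "Content-Type", "text/plain");
    assert(findNoCase(m, "content-type") != nullptr);
}
```

The trade-off is that the original spelling of each key is lost, which is why a second, case-preserving map may be needed.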

Generic way to use fs::recursive_directory_iterator() and fs::directory_iterator()

I need to iterate over a folder either recursively or not (depending on a boolean parameter). I have discovered there are fs::recursive_directory_iterator() and fs::directory_iterator(). In Java, I would expect them to implement the same interface or share a common ancestor so that I could substitute the one I need. But for some reason the two iterators do not share a common ancestor, forcing me to write code like:
if (recursive_) {
    path = recursive_iterator_->path();
    recursive_iterator_++;
} else {
    path = plain_iterator_->path();
    plain_iterator_++;
}
I cannot believe this is how it is supposed to work. I also initially assumed there would be an option to turn off recursion for recursive_directory_iterator, but there does not seem to be one among std::filesystem::directory_options.
The value is not known at compile time. I think it should be possible to use something like a closure or even a subclass with a virtual method, but that looks like overkill.
Should I simply use conditionals to switch between the two iterators as needed, or are there better approaches?
implement the same interface
They do. Both are InputIterators that dereference to const std::filesystem::directory_entry&.
C++ avoids virtual by default.
You can use boost::any_range to type erase the recursiveness.
template <typename... Args>
auto make_directory_range(bool recursive, Args... args) {
    return recursive
        ? boost::make_iterator_range(fs::recursive_directory_iterator(args...), fs::recursive_directory_iterator()) | boost::adaptors::type_erased()
        : boost::make_iterator_range(fs::directory_iterator(args...), fs::directory_iterator());
}
using iterator_t = decltype(make_directory_range(true).begin());
auto range = make_directory_range(recursive_, args...);
iterator_t iterator = range.begin();
iterator_t end = range.end();
The usual way of dealing with a static polymorphism situation like this is to use a helper template:
template<class F,class ...AA>
void for_each_file(F f,bool rec,AA &&...aa) {
    const auto g=[&](auto end) {
        std::for_each(decltype(end)(std::forward<AA>(aa)...),
                      end,std::move(f));
    };
    if(rec) g(fs::recursive_directory_iterator());
    else g(fs::directory_iterator());
}
std::size_t count(const fs::path &d,bool rec) {
    std::size_t n=0;
    for_each_file([&](fs::directory_entry) {++n;},rec,d);
    return n;
}
This approach does have limitations: it makes it harder to break out of the “loop”, for example.
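One way to soften that limitation is to let the callback decide when to stop. The following is only a sketch under that assumption: visit_until, for_each_file_until and count_entries are hypothetical names, and the callback is expected to return true to continue and false to stop.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>

namespace fs = std::filesystem;

// Visits entries until the callback returns false.
// Returns true if every entry was visited.
template <class Iter, class F>
bool visit_until(Iter first, F& f) {
    // begin()/end() overloads exist for both directory iterator types,
    // so either one can be used directly in a range-based for loop.
    for (const fs::directory_entry& entry : first) {
        if (!f(entry)) return false;
    }
    return true;
}

// The runtime flag selects the iterator type; the loop body is shared.
template <class F>
bool for_each_file_until(const fs::path& dir, bool recursive, F f) {
    assert(fs::is_directory(dir));  // precondition of this sketch
    return recursive ? visit_until(fs::recursive_directory_iterator(dir), f)
                     : visit_until(fs::directory_iterator(dir), f);
}

std::size_t count_entries(const fs::path& dir, bool recursive) {
    std::size_t n = 0;
    for_each_file_until(dir, recursive, [&](const fs::directory_entry&) {
        ++n;
        return true;  // never stop early
    });
    return n;
}
```

The conditional still appears once, but only at the point where the iterator is constructed.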

Constructing a tuple from values returned by member functions of objects inside another tuple

(This could be an XY Problem, so I'm providing some background information prior to the actual question.)
Background
I currently have a function (not a template) that computes different hash types (CRC32, MD5, SHA1, etc.) The data comes from a provider that can only provide a pointer to a chunk of the data at a time. The function computes the hashes on chunks of data iteratively.
Advancing to the next chunk is a very costly operation (involves decompression) and it can only go forward. Also the architecture must be kept zero-copy. As a result, all the selected hashes must be computed at once while iterating on the same chunks of data. Hash type selection is done through bool parameters:
std::tuple<uint32_t, QByteArray, QByteArray, QByteArray>
computeHashes(DataProvider& data, bool do_crc, bool do_md5, bool do_sha1,
bool do_sha256);
If one of the flags is false, the caller ignores the corresponding empty tuple element.
Actual Question
I am very unhappy with the above API. So I decided to write a cleaner looking function template. No boolean switches and no dummy tuple elements in the return value:
auto [crc, md5] = computeHashes<HashType::CRC32, HashType::MD5>(data_provider);
I got the code mostly working, except for the last step where I need to actually return the results. This is trimmed down from the real code, and with only two hash functions in order to keep the example as short as possible:
enum class HashType { CRC32, MD5 };
template <HashType> struct Hasher
{};
template<> struct Hasher<HashType::CRC32>
{
    void addData(const char* data, int len);
    uint32_t result() const;
};
template<> struct Hasher<HashType::MD5>
{
    void addData(const char* data, int len);
    QByteArray result() const;
};
template <HashType... hash_types>
auto computeHashes(DataProvider& provider)
{
    std::tuple<Hasher<hash_types>...> hashers;
    while (provider.hasMoreChunks()) {
        auto [chunk, len] = provider.nextChunk();
        std::apply([chunk, len](auto&... hasher)
                   { (..., hasher.addData(chunk, len)); },
                   hashers);
    }
    return std::make_tuple( ??? );
}
I'm stuck at the last step: how do I return each result? A hard-coded return would look like this:
return std::make_tuple(std::get<0>(hashers).result(),
                       std::get<1>(hashers).result());
This isn't suitable, of course. How do I do this?
Since std::apply forwards the returned value by decltype(auto), you can just construct a tuple inside std::apply and return it.
This can be combined with your transformations into one call.
template <HashType... hash_types>
static auto computeHashes(DataProvider& provider)
{
    return std::apply(
        [&provider](auto&&... hashers) {
            while (provider.hasMoreChunks())
            {
                auto [chunk, len] = provider.nextChunk();
                (..., hashers.addData(chunk, len));
            }
            // result() returns by value, so no std::move is needed here
            return std::make_tuple(hashers.result()...);
        },
        std::tuple<Hasher<hash_types>...>{}
    );
}
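To try the pattern without Qt or the real data provider, here is a self-contained sketch. SumHasher, LengthHasher and hashAll are hypothetical stand-ins, and a std::string cut into fixed-size chunks plays the role of the DataProvider; only the two std::apply calls mirror the technique above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical stand-ins for the Hasher specializations.
struct SumHasher {
    unsigned sum = 0;
    void addData(const char* data, int len) {
        for (int i = 0; i < len; ++i) sum += static_cast<unsigned char>(data[i]);
    }
    unsigned result() const { return sum; }
};

struct LengthHasher {
    std::size_t total = 0;
    void addData(const char*, int len) { total += static_cast<std::size_t>(len); }
    std::string result() const { return std::to_string(total); }
};

// Feeds every chunk to every hasher, then gathers one result per hasher.
template <class... Hashers>
auto hashAll(const std::string& data, std::size_t chunkSize) {
    assert(chunkSize > 0);
    std::tuple<Hashers...> hashers;
    for (std::size_t pos = 0; pos < data.size(); pos += chunkSize) {
        const std::size_t len = std::min(chunkSize, data.size() - pos);
        std::apply(
            [&](auto&... h) { (..., h.addData(data.data() + pos, static_cast<int>(len))); },
            hashers);
    }
    // Pack expansion over result() builds a tuple with one element per hasher,
    // each with that hasher's own result type.
    return std::apply(
        [](const auto&... h) { return std::make_tuple(h.result()...); },
        hashers);
}
```

A call such as hashAll<SumHasher, LengthHasher>("abc", 2) then yields a std::tuple<unsigned, std::string> that can be unpacked with structured bindings.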

Is there a way to make a function have different behavior if its return value will be used as an rvalue reference instead of an lvalue?

I have a routine that does some moderately expensive operations, and the client could consume the result as either a string, integer, or a number of other data types. I have a public data type that is a wrapper around an internal data type. My public class looks something like this:
class Result {
public:
    static Result compute(/* args */) {
        Result result;
        result.fData = new ExpensiveInternalObject(/* args */);
        return result;
    }
    // ... constructors, destructor, assignment operators ...
    std::string toString() const { return fData->toString(); }
    int32_t toInteger() const { return fData->toInteger(); }
    double toDouble() const { return fData->toDouble(); }
private:
    ExpensiveInternalObject* fData;
};
If you want the string, you can use it like this:
// Example A
std::string resultString = Result::compute(/*...*/).toString();
If you want more than one of the return types, you do it like this:
// Example B
Result result = Result::compute(/*...*/);
std::string resultString = result.toString();
int32_t resultInteger = result.toInteger();
Everything works.
However, I want to modify this class such that there is no need to allocate memory on the heap if the user needs only one of the result types. For example, I want Example A to essentially do the equivalent of,
auto result = ExpensiveInternalObject(/* args */);
std::string resultString = result.toString();
I've thought about structuring the code so that the args are saved in the Result instance, the ExpensiveInternalObject is not computed until one of the terminal functions (toString/toInteger/toDouble) is called, and the terminal functions are overloaded on ref-qualifiers, like this:
class Result {
    // ...
    std::string toString() const & {
        if (fData == nullptr) {
            const_cast<Result*>(this)->fData = new ExpensiveInternalObject(/*...*/);
        }
        return fData->toString();
    }
    std::string toString() && {
        auto result = ExpensiveInternalObject(/*...*/);
        return result.toString();
    }
    // ...
};
Although this avoids the heap allocation for the Example A call site, the problem with this approach is that you have to start thinking about thread safety issues. You'd probably want to make fData an std::atomic, which adds overhead to the Example B call site.
Another option would be to make two versions of compute() under different names, one for the Example A use case and one for the Example B use case, but this isn't very friendly to the user of the API, because now they have to study which version of the method to use, and they will get poor performance if they choose the wrong one.
I can't make ExpensiveInternalObject a value field inside Result (as opposed to a pointer) because doing so would require exposing too many internals in the public header file.
Is there a way to make the first function, compute(), know whether its return value is going to become an rvalue reference or whether it is going to become an lvalue, and have different behavior for each case?
You can achieve the syntax you asked for using a kind of proxy object.
Instead of a Result, Result::compute could return an object that represents a promise of a Result. This Promise object could have a conversion operator that implicitly converts to a Result so that "Example B" still works as before. But the promise could also have its own toString(), toInteger(), ... member functions for "Example A":
class Result {
public:
    class Promise {
    private:
        // args
    public:
        std::string toString() const {
            auto result = ExpensiveInternalObject(/* args */);
            return result.toString();
        }
        operator Result() {
            Result result;
            result.fData = new ExpensiveInternalObject(/* args */);
            return result;
        }
    };
    // ...
};
This approach has its downsides, though. For example, what if you instead wrote:
auto result = Result::compute(/*...*/);
std::string resultString = result.toString();
int32_t resultInteger = result.toInteger();
result is now not of Result type but actually a Result::Promise, and you end up computing ExpensiveInternalObject twice! You can at least make this fail to compile by adding an rvalue reference qualifier to the toString(), toInteger(), ... member functions on Result::Promise, but it is not ideal.
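A sketch of that rvalue-qualified variant, using a hypothetical ExpensiveInternalObject that takes a single int and counts its constructions so the effect can be observed. A std::unique_ptr replaces the raw pointer purely to keep the sketch short, and the small detection trait at the end is also an illustration, not part of the original design.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in; counts constructions so the effect is observable.
struct ExpensiveInternalObject {
    static inline int constructions = 0;
    int value;
    explicit ExpensiveInternalObject(int v) : value(v) { ++constructions; }
    std::string toString() const { return std::to_string(value); }
    int toInteger() const { return value; }
};

class Result {
public:
    class Promise {
        int arg_;  // saved argument for the deferred computation
    public:
        explicit Promise(int arg) : arg_(arg) {}
        // && qualifier: callable only on temporaries, e.g. compute(...).toString()
        std::string toString() && { return ExpensiveInternalObject(arg_).toString(); }
        int toInteger() && { return ExpensiveInternalObject(arg_).toInteger(); }
        // The conversion is restricted to temporaries as well.
        operator Result() && {
            return Result(std::make_unique<ExpensiveInternalObject>(arg_));
        }
    };

    static Promise compute(int arg) { return Promise(arg); }
    std::string toString() const { return fData->toString(); }
    int toInteger() const { return fData->toInteger(); }

private:
    explicit Result(std::unique_ptr<ExpensiveInternalObject> d) : fData(std::move(d)) {}
    std::unique_ptr<ExpensiveInternalObject> fData;
};

// Detects whether toString() may be called on an lvalue of type T.
template <class T, class = void>
struct lvalue_toString_ok : std::false_type {};
template <class T>
struct lvalue_toString_ok<T, std::void_t<decltype(std::declval<T&>().toString())>>
    : std::true_type {};

// "auto p = Result::compute(1); p.toString();" is rejected by the compiler,
// while calling toString() on a named Result still works.
static_assert(!lvalue_toString_ok<Result::Promise>::value, "Promise lvalue must be rejected");
static_assert(lvalue_toString_ok<Result>::value, "Result lvalue must be accepted");

void promiseUsageExample() {
    assert(Result::compute(7).toString() == "7");  // Example A: no heap allocation
    Result r = Result::compute(8);                 // Example B: converts to Result
    assert(r.toInteger() == 8);
}
```

A caller who writes auto still ends up holding a Promise, but any attempt to use it as an lvalue now fails at compile time instead of silently recomputing.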
Considering you can't overload a function by its return type, and you wanted to avoid making two different versions of compute(), the only thing I can think of is setting a flag in the copy constructor of Result. This could work with your particular example, but not in general. For example, it won't work if you're taking a reference, which you can't disallow.

Using RTTI hash with a template function

I understand that templates are compile-time, and typeinfo-related are runtime, but I'm wondering if I can achieve my particular task.
I have a factory method using templates to create objects of a particular type; I also have a preloader (reading data from disk), which determines what type of object is to be created, but doesn't actually create it - that's the responsibility of the creator, and is executed on demand.
void Creator::Spawn(Preloader* pl)
{
    std::unordered_map<size_t, std::type_index> hashmap;
    // assume ObjectType is simply a wrapper around a hash
    hashmap[ObjectType<Type1>::GetType().Hash()] = typeid(Type1);
    hashmap[ObjectType<Type2>::GetType().Hash()] = typeid(Type2);
    for ( const auto& i : pl->GetPreloadInfo() )
    {
        size_t hash = i->type_hash.Hash();
        // similar-to-desired usage
        FactoryCreate<hashmap[hash]>();
    }
}
Is there any way to achieve this? Obviously I can do manual checks for each, like below, but it's nasty at best.
// poor, manual implementation
if ( hash == ObjectType<Type1>::GetType().Hash() )
FactoryCreate<Type1>();
else if ( hash == ObjectType<Type2>::GetType().Hash() )
FactoryCreate<Type2>();
Everything I've tried so far has hit the runtime vs compile-time differences, though I'm definitely not aware of all the newest C++11 tricks that may assist (C++14 not usable).
Partially related question here: Use data type (class type) as key in a map
Assuming the hash part is set in stone, you can create a map from those type hashes to your factory functions directly:
using map_type = std::unordered_map<size_t, std::function<void()>>;
template <class... Ts>
map_type create_hash_map() {
    map_type map;
    // emplace (hash<T>, FactoryCreate<T>) for each T
    using swallow = int[];
    (void)swallow{0,
        (void(
            map.emplace(ObjectType<Ts>::GetType().Hash(),
                        []{ FactoryCreate<Ts>(); }
            )
        ), 0)...
    };
    return map;
}
Then we can just use that map directly:
void Creator::Spawn(Preloader* pl)
{
    static auto hashmap = create_hash_map<Type1, Type2>();
    for ( const auto& i : pl->GetPreloadInfo() )
    {
        size_t hash = i->type_hash.Hash();
        hashmap[hash]();
    }
}
That doesn't have error-checking, so if the hash isn't actually in the map, you'll get a bad_function_call exception from std::function. If you need error checking, you can instead do a lookup in the map first:
auto it = hashmap.find(hash);
if (it != hashmap.end()) {
    (it->second)();
}
else {
    // error!
}
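For experimenting with this outside the original code base, here is a self-contained sketch that sticks to C++11 features. Type1, Type2, their static Hash() and Name() members, the created() log, makeFactoryMap and spawn are all hypothetical stand-ins; function pointers are registered instead of lambdas, which is an assumption made here to keep the pack expansion simple.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the real types and their hash wrapper.
struct Type1 {
    static std::size_t Hash() { return 1; }
    static const char* Name() { return "Type1"; }
};
struct Type2 {
    static std::size_t Hash() { return 2; }
    static const char* Name() { return "Type2"; }
};

// Records what was "created" so the dispatch can be observed.
std::vector<std::string>& created() {
    static std::vector<std::string> log;
    return log;
}

template <class T>
void FactoryCreate() {
    created().push_back(T::Name());
}

using FactoryMap = std::unordered_map<std::size_t, std::function<void()>>;

// Registers FactoryCreate<T> under T::Hash() for every T in the pack.
template <class... Ts>
FactoryMap makeFactoryMap() {
    FactoryMap map;
    using swallow = int[];
    (void)swallow{0, (void(map.emplace(Ts::Hash(), &FactoryCreate<Ts>)), 0)...};
    return map;
}

// Looks the hash up first; returns false for an unknown hash instead of throwing.
bool spawn(const FactoryMap& map, std::size_t hash) {
    auto it = map.find(hash);
    if (it == map.end()) return false;
    it->second();
    return true;
}

void factoryUsageExample() {
    const FactoryMap map = makeFactoryMap<Type1, Type2>();
    assert(spawn(map, Type1::Hash()));
    assert(!spawn(map, 12345));  // unknown hash is reported, not thrown
}
```

Because spawn uses find rather than operator[], an unknown hash also does not insert an empty entry into the map.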