Note: c++98
I am a little new to C++ and I want to clean my code up. I have an if statement that checks the data type in an array, and if it matches, it executes the corresponding statement.
I want to convert this multi-line if statement to a single line that checks whether any of these types exist in the map, and if they do, executes it.
My code:
if (boost::iequals(sqlBufferTypes[i][j], "INTEGER") ||
boost::iequals(sqlBufferTypes[i][j], "INT") ||
boost::iequals(sqlBufferTypes[i][j], "BIGINT") ||
boost::iequals(sqlBufferTypes[i][j], "uint8_t") ||
boost::iequals(sqlBufferTypes[i][j], "uint16_t") ||
boost::iequals(sqlBufferTypes[i][j], "LONG"))
{
// do stuff
}
and would like to convert it similar to something like:
map<int, string> dataTypes;
dataTypes[1,"INT"];
dataTypes[2,"BIGINT"];
dataTypes[3,"uint8_t"];
dataTypes[4,"uint16_t"];
dataTypes[5,"LONG"];
if (boost::iequals(dataTypes.begin(), dataTypes.end())
{
// do stuff
}
I suppose the real challenge is to have a map<> that compares the keys case-insensitively.
You do that by using a comparison predicate:
struct ci_less {
bool operator()(std::string_view a, std::string_view b) const {
return boost::lexicographical_compare(a, b, boost::is_iless{});
}
};
You declare the map to use that predicate:
std::map<std::string, int, ci_less> const dataTypes {
{ "INT", 1 },
{ "BIGINT", 2 },
{ "uint8_t", 3 },
{ "uint16_t", 4 },
{ "LONG", 5 },
};
Note that it is now const, and I flipped the key/value pairs (see "Flipping Key/Value" below).
Some tests: Live On Coliru
// your sqlBufferTypes[i][j] e.g.:
for (std::string const key : { "uint32_t", "long", "lONg" }) {
if (auto match = dataTypes.find(key); match != dataTypes.end()) {
std::cout << std::quoted(key) << " maps to " << match->second;
// more readable repeats lookup:
std::cout << " or the same: " << dataTypes.at(key) << "\n"; // throws unless found
} else {
std::cout << std::quoted(key) << " not found\n";
}
}
Prints
"uint32_t" not found
"long" maps to 5 or the same: 5
"lONg" maps to 5 or the same: 5
Flipping Key/Value
Dictionaries have a key field for lookup in all languages/libraries. So, to lookup in reverse you end up doing a linear search (just look at each element).
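For illustration, here is a minimal sketch of such a reverse lookup using only the standard library. The names findNameById and makeDataTypes are hypothetical and exist only for this example; the code is C++98-compatible.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: returns the first key whose mapped value equals `id`,
// or an empty string when no entry matches. This visits every element, so it is O(n).
inline std::string findNameById(const std::map<std::string, int>& m, int id) {
    for (std::map<std::string, int>::const_iterator it = m.begin(); it != m.end(); ++it) {
        if (it->second == id) return it->first;
    }
    return std::string();
}

// Sample data mirroring part of the dataTypes map (assumed values).
inline std::map<std::string, int> makeDataTypes() {
    std::map<std::string, int> m;
    m["INT"] = 1;
    m["BIGINT"] = 2;
    m["LONG"] = 5;
    return m;
}
```

Calling findNameById(makeDataTypes(), 5) walks the map until it reaches the entry for "LONG", which is exactly the linear cost that a second index avoids.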
In Boost you can have your cake and eat it by defining a Multi Index Container.
Multi Index
This can facilitate lookup by multiple indices, including composite keys. (Search my answers for more real-life examples)
Live On Coliru
#include <boost/algorithm/string.hpp> // for is_iless et al.
#include <string_view>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <iostream> // for std::cout
#include <iomanip> // for std::quoted
#include <boost/locale.hpp>
namespace bmi = boost::multi_index;
struct ci_less {
bool operator()(std::string_view a, std::string_view b) const {
return boost::lexicographical_compare(a, b, boost::is_iless{});
}
};
struct DbType {
std::string_view name;
int type_id;
friend std::ostream& operator<<(std::ostream& os, DbType const& t) {
return os << "DbType{" << std::quoted(t.name) << ", " << t.type_id << "}";
}
};
using Map = bmi::multi_index_container<
DbType,
bmi::indexed_by<
bmi::ordered_unique<
bmi::tag<struct by_id>,
bmi::member<DbType, int, &DbType::type_id> >,
bmi::ordered_unique<
bmi::tag<struct by_name>,
bmi::member<DbType, std::string_view, &DbType::name>, ci_less>
>
>;
int main() {
Map dataTypes {
{ "INT", 1 },
{ "BIGINT", 2 },
{ "uint8_t", 3 },
{ "uint16_t", 4 },
{ "LONG", 5 },
};
auto& idx = dataTypes.get<by_name>();
// your sqlBufferTypes[i][j] e.g.:
for (std::string_view const key : { "uint32_t", "long", "lONg" }) {
if (auto match = idx.find(key); match != idx.end()) {
std::cout << std::quoted(key) << " -> " << *match << std::endl;
} else {
std::cout << std::quoted(key) << " not found\n";
}
}
}
Prints
"uint32_t" not found
"long" -> DbType{"LONG", 5}
"lONg" -> DbType{"LONG", 5}
Bimaps
Boost Bimap is a specialization of that for maps. It has fewer options, and notably adds operator[] style interface back.
using Map = boost::bimap<
int,
boost::bimaps::set_of<std::string_view, ci_less>>;
Sadly the constructor doesn't support initializer lists, but we can use the iterator interface, and then we use the right view of the bimap to do lookups by name:
Live On Coliru
static const Map::relation s_mappings[] = {
{ 1, "INT" },
{ 2, "BIGINT" },
{ 3, "uint8_t" },
{ 4, "uint16_t" },
{ 5, "LONG" },
};
Map const dataTypes { std::begin(s_mappings), std::end(s_mappings) };
// your sqlBufferTypes[i][j] e.g.:
auto& vw = dataTypes.right;
for (std::string_view const key : { "uint32_t", "long", "lONg" }) {
if (auto match = vw.find(key); match != vw.end()) {
std::cout << std::quoted(key) << " -> " << match->second << "\n";
} else {
std::cout << std::quoted(key) << " not found\n";
}
}
Prints
"uint32_t" not found
"long" -> 5
"lONg" -> 5
I think you switched around the key and id in your example, judging by your description.
However, you can do what you want fairly easily by introducing a custom compare for the map. An easy way is to create a lambda function first:
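// The comparator must be a strict weak ordering ("less than"), not an equality test,
// so use boost::ilexicographical_compare rather than boost::iequals.
auto compare = [](const std::string& a, const std::string& b) { return boost::ilexicographical_compare(a, b); };
// Before C++20 a lambda type is not default-constructible, so pass the lambda to the constructor.
std::map<std::string, int, decltype(compare)> mymap({ {"INT", 1}, {"BIGINT", 2} }, compare); // and so on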
Then you can do:
if (mymap.count(sqlBufferTypes[i][j])) {//in c++20 can use "contains" instead of "count"
//do stuff
}
If all you want to do is check and you don't need the ID later, then you should use a std::set instead of a map.
c++98 NOTE
In c++98 you cannot use a lambda, however you can still make the compare predicate with a regular struct:
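struct compare : public std::binary_function<std::string, std::string, bool>
{
bool operator()(const std::string& a, const std::string& b) const
{
// must be a case-insensitive "less than", not an equality test
return boost::ilexicographical_compare(a, b);
}
};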
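As a sketch of the std::set variant that also compiles as C++98 and does not depend on Boost: the comparator below is built from std::lexicographical_compare and std::tolower. The names CharILess, StringILess, TypeSet, makeIntegerTypes and isIntegerType are made up for this example, and the case folding is ASCII/C-locale only.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper: case-insensitive "less than" for single characters.
struct CharILess {
    bool operator()(char a, char b) const {
        return std::tolower(static_cast<unsigned char>(a)) <
               std::tolower(static_cast<unsigned char>(b));
    }
};

// Strict weak ordering on strings that ignores case.
struct StringILess {
    bool operator()(const std::string& a, const std::string& b) const {
        return std::lexicographical_compare(a.begin(), a.end(),
                                            b.begin(), b.end(), CharILess());
    }
};

typedef std::set<std::string, StringILess> TypeSet;

// Builds the set of integer-like type names listed in the question.
inline TypeSet makeIntegerTypes() {
    static const char* const names[] = {
        "INTEGER", "INT", "BIGINT", "uint8_t", "uint16_t", "LONG"
    };
    return TypeSet(names, names + sizeof names / sizeof *names);
}

// True when `name` matches one of the known types, ignoring case.
inline bool isIntegerType(const std::string& name) {
    static const TypeSet types = makeIntegerTypes();
    return types.count(name) != 0;
}
```

With this in place, the multi-line condition from the question would shrink to if (isIntegerType(sqlBufferTypes[i][j])) { /* do stuff */ }.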
Related
I have some global variables that will be assigned a value once the configuration file is read.
bool bar1;
int bar2;
string bar3;
I read the configuration file which looks like below:
foo1 = 12
foo2 = 0
foo3 = 1
...
void func()
{
//read file into a std::map mp
for(auto i:mp)
{
if(i.first=="foo1")
bar1 = i.second;
else if(i.first=="foo2")
bar2 = i.second;
else if(i.first=="foo3")
bar3 = i.second;
.....
}
}
I have a lot of such variables to initialize from a file. Is there a better way to do this because this will bloat my function.
PS:I am still stuck with C++03.
In my comment, I elaborated a bit on the idea of Jabberwocky to use a std::map.
Actually, we do similar things in our S/W for configuration and similar things. The only difference – we don't use a std::map for this but a pre-defined array. (I didn't like the idea that something has to be done at run-time which actually never changes after compiling.) To demonstrate the concept I made a little MCVE:
#include <iostream>
#include <cassert>
#include <cstring>
#include <algorithm>
#include <map>
int main()
{
// variables
int bar1 = 0, bar2 = 0, bar3 = 0;
// symbol table
const struct Entry { const char *key; int *pVar; } table[] = {
{ "foo1", &bar1 },
{ "foo2", &bar2 },
{ "foo3", &bar3 }
};
const size_t nTable = sizeof table / sizeof *table;
// check that table has correct order
assert([&]()
{
for (size_t i = 1; i < nTable; ++i) {
if (strcmp(table[i - 1].key, table[i].key) >= 0) return false;
}
return true;
}());
// use table in tests
std::pair<const char*, int> mp[] = {
{ "foo1", 123 },
{ "foo2", 234 },
{ "foo3", 345 },
{ "wrong", 666 }
};
// evaluate mp of OP
for (auto i : mp) {
const Entry e = { i.first, 0 };
const auto iter
= std::lower_bound(std::begin(table), std::end(table), e,
[](const Entry &e1, const Entry &e2) { return strcmp(e1.key, e2.key) < 0; });
if (iter != std::end(table) && strcmp(iter->key, i.first) == 0) *iter->pVar = i.second;
else std::cerr << "Unknown var '" << i.first << "'!\n";
}
// print result
std::cout
<< "bar1: " << bar1 << '\n'
<< "bar2: " << bar2 << '\n'
<< "bar3: " << bar3 << '\n';
// done
return 0;
}
Output:
Unknown var 'wrong'!
bar1: 123
bar2: 234
bar3: 345
Live Demo on coliru
The essential part is the struct Entry which groups the name of an option with the address of the corresponding variable. This could be used to store pairs of names and variable addresses in a std::map.
I used instead a pre-sorted array. (Sorting the keys manually in programming is not that difficult – in case of accidents the assert() will alert.)
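For comparison, a minimal sketch of the std::map variant mentioned above, written so that it also compiles as C++03. The Settings struct and the names SymbolTable, makeSymbolTable and assign are assumptions made for this example; only foo1..foo3 and bar1..bar3 come from the question.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical holder for the configuration targets.
struct Settings {
    int bar1, bar2, bar3;
    Settings() : bar1(0), bar2(0), bar3(0) {}
};

typedef std::map<std::string, int*> SymbolTable;

// Registers the address of each variable under its option name.
// Unlike the pre-sorted array, the map sorts the keys itself at run time.
inline SymbolTable makeSymbolTable(Settings& s) {
    SymbolTable table;
    table["foo1"] = &s.bar1;
    table["foo2"] = &s.bar2;
    table["foo3"] = &s.bar3;
    return table;
}

// Stores `value` in the variable registered for `key`.
// Returns false (and changes nothing) when the key is unknown.
inline bool assign(const SymbolTable& table, const std::string& key, int value) {
    SymbolTable::const_iterator it = table.find(key);
    if (it == table.end()) return false;
    *it->second = value;
    return true;
}
```

The table must not outlive the Settings object it points into, which is the same lifetime concern as with the array of addresses of local variables.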
In our production software, we didn't store addresses of variables but pointers to setter methods, because the destination variables have varying types and the values (provided as strings) have to be parsed accordingly. However, these method pointers can be resolved at compile time, so the whole table can be static. Hence, the effort of building the table on each function call is avoided. In this demo, the table stores addresses of local variables, which made me feel that a static table could be a bad idea (I didn't even try it).
Upon request, here another demonstration using method pointers to setter methods:
#include <iostream>
#include <cassert>
#include <cstring>
#include <cstdlib> // for atoi, strtod
#include <string>
#include <algorithm>
class Object {
private:
// some member variables:
int var1, var2;
std::string var3;
double var4;
public:
Object(): var1(), var2(), var4() { }
friend std::ostream& operator<<(std::ostream &out, const Object &obj);
// the setter methods
void setVar1(const char *value) { var1 = atoi(value); }
void setVar2(const char *value) { var2 = atoi(value); }
void setVar3(const char *value) { var3 = value; }
void setVar4(const char *value) { var4 = strtod(value, nullptr); }
// the config method to set value by text
void config(const char *key, const char *value)
{
// symbol table
static const struct Entry {
const char *key; // the symbol
void (Object::*set)(const char*); // the corresponding setter method
} table[] = {
{ "var1", &Object::setVar1 },
{ "var2", &Object::setVar2 },
{ "var3", &Object::setVar3 },
{ "var4", &Object::setVar4 }
};
enum { nTable = sizeof table / sizeof *table };
// check that table has correct order (paranoid - debug only code)
assert([&]()
{
for (size_t i = 1; i < nTable; ++i) {
if (strcmp(table[i - 1].key, table[i].key) >= 0) return false;
}
return true;
}());
// find setter by key
const Entry e = { key, nullptr };
const auto iter
= std::lower_bound(std::begin(table), std::end(table), e,
[](const Entry &e1, const Entry &e2) { return strcmp(e1.key, e2.key) < 0; });
if (iter != std::end(table) && strcmp(iter->key, key) == 0) {
(this->*iter->set)(value);
} else std::cerr << "Unknown var '" << key << "'!\n";
}
};
std::ostream& operator<<(std::ostream &out, const Object &obj)
{
return out
<< "var1: " << obj.var1 << ", var2: " << obj.var2
<< ", var3: '" << obj.var3 << "', var4: " << obj.var4;
}
int main()
{
Object obj;
// print obj before config:
std::cout << "obj: " << obj << '\n';
// configure obj
std::pair<const char*, const char*> config[] = {
{ "var1", "123" },
{ "var2", "456" },
{ "var3", "text" },
{ "var4", "1.23" },
{ "evil", "666" }
};
for (const auto& entry : config) {
obj.config(entry.first, entry.second);
}
// print obj after config:
std::cout << "obj: " << obj << '\n';
// done
return 0;
}
Output:
obj: var1: 0, var2: 0, var3: '', var4: 0
Unknown var 'evil'!
obj: var1: 123, var2: 456, var3: 'text', var4: 1.23
The contents of table (in Object::config()) is static const and will be built at compile-time (and hopefully "burnt" into the binary). Hence, the multiple calls of Object::config() have the only effort of binary search of the matching key and calling the setter in case of success.
An essential precondition is that all setter methods have the same signature. Otherwise, storing them in an array wouldn't be possible, as they all have to be compatible with the method pointer element in the array.
Live Demo on coliru
In some words: how can I pass various fields from a custom class to a single function?
Now in details:
I have a std::vector containing a class, for example CustomClass from which I have to extract a result from a field from this class by some criteria which are fields in this class and to combine somehow this data.
My first approach to this problem was to use a function which accepts the std::vector of the class as a parameter in order to extract the data and return a std::map. The key in this map is the type of the criteria by which the data should be combined, and the value is an int with the combined data from all members of this vector.
The problem is that the criteria is not only one - more than one field from this class may be used as criteria (let for easiness all of the criteria are std::string, if they are not - I could make the function templated).
The easiest way for me now is to make dozens of functions with almost identical code and each of them to extract a simple concrete field from this class. However changes might require similar changes to all of the dozens of functions which would be a maintenance headache. But in this stage I cannot think how to pass to a single function a field from this class...
Here's an example code from this class:
// this is the class with data and criteria
class CustomClass
{
public:
std::string criteria1;
std::string criteria2;
std::string criteria3;
//... and others criteria
int dataToBeCombined;
// other code
};
// this is one of these functions
std::map<std::string, int> getDataByCriteria1(std::vector<CustomClass> aVector)
{
std::map<std::string, int> result;
for (const CustomClass& anObject : aVector)
{
if(result.find(anObject.criteria1)==result.end()) // if such of key doesn't exists
{
result.insert(std::make_pair(anObject.criteria1, anObject.dataToBeCombined));
}
else
{
// do some other stuff in order to combine data
}
}
return result;
}
and by similar way I should make the other functions which should work with CustomClass::criteria2, CustomClass::criteria3, etc.
I thought to make these criteria in a single array and to pass to this function only the number of the criteria but the class will be used by others for other purposes and the fields must be easy to read, so this will not be an option (i.e. the real names are not criteria1, criteria2, etc. but are descriptive).
Anyone with ideas?
EDIT: Someone referred my question to "C++ same function parameters with different return type" which obviously is very different - the function in my case return the same type every time, just the parameters it takes must be various fields from a class.
You can use pointer to member. Declare an argument std::string CustomClass::*pField in your function, pass it with &CustomClass::criteriaN, access it with anObject.*pField.
See more on the topic: Pointers to data members.
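A minimal sketch of how that could look (C++11). The class is a reduced version of the one in the question; combining by addition and the makeSample helper are assumptions for illustration, since the question leaves the combining step open.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Reduced version of the class from the question.
class CustomClass {
public:
    std::string criteria1;
    std::string criteria2;
    int dataToBeCombined;
};

// One function for every std::string criterion: pField selects the member.
// Assumption: "combining" means summing dataToBeCombined per key.
std::map<std::string, int> getDataByCriteria(const std::vector<CustomClass>& aVector,
                                             std::string CustomClass::*pField) {
    std::map<std::string, int> result;
    for (const CustomClass& anObject : aVector) {
        // operator[] value-initializes a missing key to 0 before the addition
        result[anObject.*pField] += anObject.dataToBeCombined;
    }
    return result;
}

// Hypothetical sample data for trying the function out.
std::vector<CustomClass> makeSample() {
    return { {"red", "big", 1}, {"red", "small", 2}, {"blue", "big", 4} };
}
```

The call site then picks the field: getDataByCriteria(v, &CustomClass::criteria1) groups by the first criterion, and passing &CustomClass::criteria2 groups by the second, without duplicating the function body.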
If all "criteria" are of the same type, I don't see an elegant solution, but you can "enumerate" them in some way and use their number.
By example, you can declare a templated getVal() method in CustomClass in this way
template <int I>
const std::string & getVal () const;
and implement them, number by number, criteria by criteria, in this way (outside the body of the class)
template <>
const std::string & CustomClass::getVal<1> () const
{ return criteria1; }
template <>
const std::string & CustomClass::getVal<2> () const
{ return criteria2; }
template <>
const std::string & CustomClass::getVal<3> () const
{ return criteria3; }
Now, you can transform getDataByCriteria1() in a templated function getDataByCriteria() in this way
template <int I>
std::map<std::string, int> getDataByCriteria (std::vector<CustomClass> aVector)
{
std::map<std::string, int> result;
for (const auto & cc : aVector)
{
if ( result.find(cc.getVal<I>()) == result.end()) // if such of key doesn't exists
{
result.insert(std::make_pair(cc.getVal<I>(), cc.dataToBeCombined));
}
else
{
// do some other stuff in order to combine data
}
}
return result;
}
and call it in this way
auto map1 = getDataByCriteria<1>(ccVec);
auto map2 = getDataByCriteria<2>(ccVec);
auto map3 = getDataByCriteria<3>(ccVec);
--- EDIT: added solution (C++14 only) for different types criteria ---
A little different if the "criteria" are of different types.
The solution works, but only from C++14 on, thanks to auto return types and decltype().
By example, if
std::string criteria1;
int criteria2;
long criteria3;
You can declare getVal() with auto
template <int I>
const auto & getVal () const;
and define (with auto) all versions of getVal()
template <>
const auto & CustomClass::getVal<1> () const
{ return criteria1; }
template <>
const auto & CustomClass::getVal<2> () const
{ return criteria2; }
template <>
const auto & CustomClass::getVal<3> () const
{ return criteria3; }
and combining auto with decltype(), you can modify getDataByCriteria() in this way
template <int I>
auto getDataByCriteria (std::vector<CustomClass> aVector)
{
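// decltype() yields a reference type here; std::decay_t (from <type_traits>) strips it to get a usable key type
std::map<std::decay_t<decltype(aVector[0].getVal<I>())>, int> result;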
for (const auto & cc : aVector)
{
if ( result.find(cc.getVal<I>()) == result.end()) // if such of key doesn't exists
{
result.insert(std::make_pair(cc.getVal<I>(), cc.dataToBeCombined));
}
else
{
// do some other stuff in order to combine data
}
}
return result;
}
The use of the function remains the same (thanks to auto again)
auto map1 = getDataByCriteria<1>(ccVec);
auto map2 = getDataByCriteria<2>(ccVec);
auto map3 = getDataByCriteria<3>(ccVec);
p.s.: caution: code not tested
p.s.2 : sorry for my bad English
You can use a function to extract a field, such as
std::string extractField(const CustomClass &object, int which) {
switch (which) {
case 1:
return object.criteria1;
case 2:
return object.criteria2;
case 3:
return object.criteria3;
default:
return object.criteria1;
}
}
and add an argument to getDataByCriteria to indicate which field to use.
Or you can just use macro to implement getDataByCriteria.
You tagged it C++11, so use variadic templates.
class VariadicTest
{
public:
VariadicTest()
{
std::map<std::string, int> test1 = getDataByCriteria(testValues, criteria1);
std::map<std::string, int> test2 = getDataByCriteria(testValues, criteria2);
std::map<std::string, int> test3 = getDataByCriteria(testValues, criteria1, criteria2);
std::map<std::string, int> test4 = getDataByCriteria(testValues, criteria1, criteria3);
}
private:
std::string criteria1 = { "Hello" };
std::string criteria2 = { "world" };
std::string criteria3 = { "." };
std::vector<CustomClass> testValues = { {"Hello",1}, {"world",2},{ "!",3 } };
template<typename T> std::map<std::string, int> getDataByCriteria(std::vector<CustomClass> values, T criteria)
{
std::map<std::string, int> result;
//do whatever is needed here to filter values
for (auto v : values)
{
if (v.identifier == criteria)
{
result[v.identifier] = v.value; // store the matching element, not always the first one
}
}
return result;
}
template<typename T, typename... Args> std::map<std::string, int> getDataByCriteria(std::vector<CustomClass> values, T firstCriteria, Args... args)
{
std::map<std::string, int> result = getDataByCriteria(values, firstCriteria);
std::map<std::string, int> trailer = getDataByCriteria(values, args...);
result.insert(trailer.begin(), trailer.end());
return result;
}
};
You do not specify the actual operations to be done under the various conditions of the criteria being met so it is hard to say how much they actually can be combined.
Here is a possible solution using the std::accumulate() of the STL along with some additional functionality. This example was compiled with Visual Studio 2015.
This approach would make sense if most of the functionality can be combined into a reasonably small accumulation function because most of the criteria are handled in the same way. Or you could have the accumulate_op() function call other functions for specific cases while handling the general case itself.
You might take this as a beginning and make the appropriate modifications.
One such modification may be to get rid of the use of std::map to maintain state. Since using this approach you would iterate through the std::vector doing the accumulation based on the criteria, I am not sure you would even need to use std::map to remember anything if you are accumulating as you go.
// map_fold.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <vector>
#include <map>
#include <string>
#include <numeric>
// this is the class with data and criteria
class CustomClass
{
public:
CustomClass() : dataToBeCombined(0) {}
std::string criteria1;
std::string criteria2;
std::string criteria3;
//... and others criteria
int dataToBeCombined;
// other code
};
// This is the class that will contain the results as we accumulate across the
// vector of CustomClass items.
class Criteria_Result {
public:
Criteria_Result() : dataToBeCombined(0) {}
CustomClass myCriteria;
std::map<std::string, int> result1;
std::map<std::string, int> result2;
std::map<std::string, int> result3;
int dataToBeCombined;
};
// This is the accumulation function we provide to std::accumulate().
// This function will build our results.
class accumulate_op {
public:
Criteria_Result * operator ()(Criteria_Result * x, CustomClass &item);
};
Criteria_Result * accumulate_op::operator ()(Criteria_Result *result, CustomClass &item)
{
if (!result->myCriteria.criteria1.empty() && !item.criteria1.empty()) {
std::map<std::string, int>::iterator it1 = result->result1.find(item.criteria1);
if (it1 == result->result1.end()) // if such of key doesn't exists
{
result->result1.insert(std::make_pair(item.criteria1, item.dataToBeCombined));
}
else
{
// do some other stuff in order to combine data
it1->second += item.dataToBeCombined;
}
result->dataToBeCombined += item.dataToBeCombined;
}
if (!result->myCriteria.criteria2.empty() && !item.criteria2.empty()) {
std::map<std::string, int>::iterator it2 = result->result2.find(item.criteria2);
if (it2 == result->result2.end()) // if such of key doesn't exists
{
result->result2.insert(std::make_pair(item.criteria2, item.dataToBeCombined));
}
else
{
// do some other stuff in order to combine data
it2->second += item.dataToBeCombined;
}
result->dataToBeCombined += item.dataToBeCombined;
}
if (!result->myCriteria.criteria3.empty() && !item.criteria3.empty()) {
std::map<std::string, int>::iterator it3 = result->result3.find(item.criteria3);
if (it3 == result->result3.end()) // if such of key doesn't exists
{
result->result3.insert(std::make_pair(item.criteria3, item.dataToBeCombined));
}
else
{
// do some other stuff in order to combine data
it3->second += item.dataToBeCombined;
}
result->dataToBeCombined += item.dataToBeCombined;
}
return result;
}
int main()
{
Criteria_Result result;
std::vector<CustomClass> aVector;
// set up the criteria for the search
result.myCriteria.criteria1 = "string1";
result.myCriteria.criteria2 = "string2";
for (int i = 0; i < 10; i++) {
CustomClass xx;
xx.dataToBeCombined = i;
if (i % 2) {
xx.criteria1 = "string";
}
else {
xx.criteria1 = "string1";
}
if (i % 3) {
xx.criteria2 = "string";
}
else {
xx.criteria2 = "string2";
}
aVector.push_back (xx);
}
// fold the vector into our results.
std::accumulate (aVector.begin(), aVector.end(), &result, accumulate_op());
std::cout << "Total Data to be combined " << result.dataToBeCombined << std::endl;
std::cout << " result1 list " << std::endl;
for (auto jj : result.result1) {
std::cout << " " << jj.first << " " << jj.second << std::endl;
}
std::cout << " result2 list " << std::endl;
for (auto jj : result.result2) {
std::cout << " " << jj.first << " " << jj.second << std::endl;
}
std::cout << " result3 list " << std::endl;
for (auto jj : result.result3) {
std::cout << " " << jj.first << " " << jj.second << std::endl;
}
std::cout << " Trial two \n\n" << std::endl;
result.myCriteria.criteria2 = "";
result.result1.clear();
result.result2.clear();
result.result3.clear();
result.dataToBeCombined = 0;
// fold the vector into our results.
std::accumulate(aVector.begin(), aVector.end(), &result, accumulate_op());
std::cout << "Total Data to be combined " << result.dataToBeCombined << std::endl;
std::cout << " result1 list " << std::endl;
for (auto jj : result.result1) {
std::cout << " " << jj.first << " " << jj.second << std::endl;
}
std::cout << " result2 list " << std::endl;
for (auto jj : result.result2) {
std::cout << " " << jj.first << " " << jj.second << std::endl;
}
std::cout << " result3 list " << std::endl;
for (auto jj : result.result3) {
std::cout << " " << jj.first << " " << jj.second << std::endl;
}
return 0;
}
This produces the output as follows:
Total Data to be combined 90
result1 list
string 25
string1 20
result2 list
string 27
string2 18
result3 list
Trial two
Total Data to be combined 45
result1 list
string 25
string1 20
result2 list
result3 list
I need to get the hash of a value with arbitrary precision (from Boost.Multiprecision); I use the cpp_int backend. I came up with the following code:
boost::multiprecision::cpp_int x0 = 1;
const auto seed = std::hash<std::string>{}(x0.str());
I don't need the code to be as fast as possible, but I find it very clumsy to hash the string representation.
So my question is twofold:
Keeping the arbitrary precision, can I hash the value more efficiently?
Maybe I should not insist on keeping the arbitrary precision and should instead convert to a double, which I could hash easily (I would still, however, make the comparison needed for the hash table using the arbitrary precision value)?
You can (ab)use the serialization support:
Support for serialization comes in two forms:
Classes number, debug_adaptor, logged_adaptor and rational_adaptor have "pass through" serialization support which requires the underlying backend to be serializable.
Backends cpp_int, cpp_bin_float, cpp_dec_float and float128 have full support for Boost.Serialization.
So, let me cobble something together that works with boost and std unordered containers:
template <typename Map>
void test(Map const& map) {
std::cout << "\n" << __PRETTY_FUNCTION__ << "\n";
for(auto& p : map)
std::cout << p.second << "\t" << p.first << "\n";
}
int main() {
using boost::multiprecision::cpp_int;
test(std::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
test(boost::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
}
Let's forward the relevant hash<> implementations to our own hash_impl specialization that uses Multiprecision and Serialization:
namespace std {
template <typename backend>
struct hash<boost::multiprecision::number<backend> >
: mp_hashing::hash_impl<boost::multiprecision::number<backend> >
{};
}
namespace boost {
template <typename backend>
struct hash<multiprecision::number<backend> >
: mp_hashing::hash_impl<multiprecision::number<backend> >
{};
}
Now, of course, this raises the question: how is hash_impl implemented?
template <typename T> struct hash_impl {
size_t operator()(T const& v) const {
using namespace boost;
size_t seed = 0;
{
iostreams::stream<hash_sink> os(seed);
archive::binary_oarchive oa(os, archive::no_header | archive::no_codecvt);
oa << v;
}
return seed;
}
};
This looks pretty simple. That's because Boost is awesome, and writing a hash_sink device for use with Boost Iostreams is just the following straightforward exercise:
namespace io = boost::iostreams;
struct hash_sink {
hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {}
typedef char char_type;
typedef io::sink_tag category;
std::streamsize write(const char* s, std::streamsize n) {
boost::hash_combine(*_ptr, boost::hash_range(s, s+n));
return n;
}
private:
size_t* _ptr;
};
Full Demo:
Live On Coliru
#include <iostream>
#include <iomanip>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/multiprecision/cpp_int.hpp>
#include <boost/multiprecision/cpp_int/serialize.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <boost/iostreams/stream_buffer.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/functional/hash.hpp>
namespace mp_hashing {
namespace io = boost::iostreams;
struct hash_sink {
hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {}
typedef char char_type;
typedef io::sink_tag category;
std::streamsize write(const char* s, std::streamsize n) {
boost::hash_combine(*_ptr, boost::hash_range(s, s+n));
return n;
}
private:
size_t* _ptr;
};
template <typename T> struct hash_impl {
size_t operator()(T const& v) const {
using namespace boost;
size_t seed = 0;
{
iostreams::stream<hash_sink> os(seed);
archive::binary_oarchive oa(os, archive::no_header | archive::no_codecvt);
oa << v;
}
return seed;
}
};
}
#include <unordered_map>
#include <boost/unordered_map.hpp>
namespace std {
template <typename backend>
struct hash<boost::multiprecision::number<backend> >
: mp_hashing::hash_impl<boost::multiprecision::number<backend> >
{};
}
namespace boost {
template <typename backend>
struct hash<multiprecision::number<backend> >
: mp_hashing::hash_impl<multiprecision::number<backend> >
{};
}
template <typename Map>
void test(Map const& map) {
std::cout << "\n" << __PRETTY_FUNCTION__ << "\n";
for(auto& p : map)
std::cout << p.second << "\t" << p.first << "\n";
}
int main() {
using boost::multiprecision::cpp_int;
test(std::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
test(boost::unordered_map<cpp_int, std::string> {
{ cpp_int(1) << 111, "one" },
{ cpp_int(2) << 222, "two" },
{ cpp_int(3) << 333, "three" },
});
}
Prints
void test(const Map&) [with Map = std::unordered_map<boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<> >, std::basic_string<char> >]
one 2596148429267413814265248164610048
three 52494017394792286184940053450822912768476066341437098474218494553838871980785022157364316248553291776
two 13479973333575319897333507543509815336818572211270286240551805124608
void test(const Map&) [with Map = boost::unordered::unordered_map<boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<> >, std::basic_string<char> >]
three 52494017394792286184940053450822912768476066341437098474218494553838871980785022157364316248553291776
two 13479973333575319897333507543509815336818572211270286240551805124608
one 2596148429267413814265248164610048
As you can see, the difference in implementation between Boost's and the standard library's unordered_map shows up in the different orderings for identical hashes.
Just to say that I've just added native hashing support (for Boost.Hash and std::hash) to git develop. It works for all the number types including those from GMP etc. Unfortunately that code won't be released until Boost-1.62 now.
The answer above that (ab)uses serialization support, is actually extremely cool and really rather clever ;) However, it wouldn't work if you wanted to use a vector-based hasher like CityHash, I added an example of using that by accessing the limbs directly to the docs: https://htmlpreview.github.io/?https://github.com/boostorg/multiprecision/blob/develop/doc/html/boost_multiprecision/tut/hash.html Either direct limb-access or the serialization tip will work with all previous releases of course.
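To give a rough idea of what hashing the limbs directly amounts to, here is a sketch that uses only the standard library. It assumes the number's internal words are already available as a std::vector<std::uint64_t>; how to obtain them from a cpp_int backend is not shown and is covered by the linked docs. The names Limbs, hashCombine and hashLimbs are hypothetical, and the mixing step is modeled on the formula long used by boost::hash_combine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Stand-in for a big integer's internal words ("limbs").
// Assumption: the real limbs would be read from the multiprecision backend.
typedef std::vector<std::uint64_t> Limbs;

// Mixes one hash value into a running seed.
inline void hashCombine(std::size_t& seed, std::size_t value) {
    seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hashes every limb and folds the results into a single value.
// No string conversion or serialization is involved.
inline std::size_t hashLimbs(const Limbs& limbs) {
    std::size_t seed = 0;
    for (std::size_t i = 0; i < limbs.size(); ++i) {
        hashCombine(seed, std::hash<std::uint64_t>()(limbs[i]));
    }
    return seed;
}
```

Equal limb sequences always produce the same hash, which is the only property an unordered container strictly requires; the quality of the distribution depends on the mixing function chosen.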
if I uncomment these
//BaseList baselist;
//MemberList memberlist;
outside the loop and comment out the ones inside the loop it crashes. I need to be able to have the baselist (and memberlist) outside any loop. How is this achieved?
Edit
The actual problem I am trying to solve, in its simplest form, is this.
I want to have a std::vector of MyClass, call it AllThingsBunchedTogether.
I also want to have a std::vector of BaseList, call it AllThingsSpreadOut.
So
AllThingsBunchedTogether might contain (just the anInt1 part for the sake of compactness): 1,2,1,10,2,3,4,4,5,9,10,10.
AllThingsSpreadOut might contain (zero not used for now) at [1] 1,1 at [2] 2,2 at [3] 3 at [4] 4,4 at [5] 5 at [9] 9 at [10] 10,10,10.
Note that the numbers themselves aren't stored in the BaseList, but, e.g., the MyClass(1, "John").
At [1] it could be "Mike", "John", at [2] it could be "Mike", "Dagobart" at [3]
"John" ... at [10] "John" "Mike" "Dagobart" etc so that there no duplicates in
any of the BaseList at AllThingsSpreadOut[i] since each MyClass in each
BaseList hashes to a different value (anInt1 + Name).
In essence, anInt1 tells where the MyClass lives in AllThingsSpreadOut, but anInt1 + name guarantees uniqueness within each BaseList.
So the idea is that AllThingsSpreadOut is a vector of BaseList where at each BaseList at vector location is a list of similar things.
Then, when I remove things from AllThingsBunchedTogether (not by a clear, but by a search to remove some items like in the code below IsMarkedToDelete), they will automatically disappear from the corresponding AllThingsSpreadOut.
AllThingsSpreadOut acts as a sort for AllThingsBunchedTogether, with intrusive semantics. AllThingsBunchedTogether allows superfast access through [].
End Edit
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
#include <boost/intrusive/list.hpp>
using namespace boost::intrusive;
class MyClass : public list_base_hook<link_mode<auto_unlink>> // This is a derivation hook
{
public:
std::string name;
bool bIsMarkedToDelete;
int anInt1;
public:
list_member_hook<link_mode<auto_unlink>> member_hook_; // This is a member hook
MyClass(std::string n, int i) : name(n), anInt1(i), bIsMarkedToDelete(false) {}
};
bool IsMarkedToDelete(const MyClass &o)
{
return o.bIsMarkedToDelete;
}
//Define a list that will store MyClass using the public base hook
typedef list<MyClass, constant_time_size<false>> BaseList;
// Define a list that will store MyClass using the public member hook
typedef list<MyClass,
member_hook<MyClass, list_member_hook<link_mode<auto_unlink>>, &MyClass::member_hook_>,
constant_time_size<false> > MemberList;
int main()
{
bool done = false;
std::vector<MyClass> values;
std::string names[] = {"John", "Mike", "Dagobart"};
//BaseList baselist;
//MemberList memberlist;
int i = 0;
while(!done)
{
// Create several MyClass objects, each one with a different value
for (int j = 0; j < 11; ++j)
values.emplace_back(names[j % 3], j);
BaseList baselist;
MemberList memberlist;
// Now insert them in the reverse order in the base hook list
for (auto& e : values)
{
baselist.push_front(e);
memberlist.push_back(e);
}
// Now test lists
auto rbit(baselist.rbegin());
auto mit(memberlist.begin());
auto it(values.begin()), itend(values.end());
// Test the objects inserted in the base hook list
for (; it != itend; ++it, ++rbit)
{
if (&*rbit != &*it)
return 1;
}
// Test the objects inserted in the member hook list
for (it = values.begin(); it != itend; ++it, ++mit)
{
if (&*mit != &*it)
return 1;
}
# if 0
for(auto& e : values)
std::cout << e.anInt1 << "\n";
for(auto& e : baselist)
std::cout << e.anInt1 << "\n";
for(auto& e : memberlist)
std::cout << e.anInt1 << "\n";
#endif // 0
if(2 == i)
{
for(auto& e: values)
std::cout << e.name << "\n";
for(auto& e: values)
{
if("Mike" == e.name)
e.bIsMarkedToDelete = true;
}
values.erase(
std::remove_if(values.begin(), values.end(), IsMarkedToDelete), values.end());
}
if(i++ > 3)
{
values.clear();
done = true;
}
std::cout << "\n";
std::cout << values.size() << "\n";
std::cout << baselist.size() << "\n";
std::cout << memberlist.size() << "\n";
}
}
I've seen it late, but anyways, here goes:
What you describe matches exactly the implementation of an intrusive hash table of MyClass elements, where
anInt1 is the hash (the bucket identifier) for an element
the bucket lists are implemented as linked lists
equality is defined as equality of (anInt1, Name)
So really, your program could just be:
Live On Coliru
std::unordered_set<MyClass> values {
{ "John", 0 }, { "Mike", 1 }, { "Dagobart", 2 },
{ "John", 3 }, { "Mike", 4 }, { "Dagobart", 5 },
{ "John", 6 }, { "Mike", 7 }, { "Dagobart", 8 },
{ "John", 9 }, { "Mike", 10 },
};
for(int i = 0; i<=3; ++i) {
if(2 == i) {
for(auto& e: values) std::cout << e.name << " "; std::cout << "\n";
for(auto& e: values) e.bIsMarkedToDelete |= ("Mike" == e.name);
for(auto it=begin(values); it!=end(values);) {
if (it->bIsMarkedToDelete) it = values.erase(it);
else ++it;
}
}
std::cout << "i=" << i << ", values.size(): " << values.size() << "\n";
}
values.clear();
std::cout << "Done\n";
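The snippet above presumes that MyClass can be hashed and compared for equality, which it doesn't show. A minimal sketch of what that might look like follows. The std::hash specialization and the erase_marked helper are my own illustration, and the flag is made mutable because elements of a std::unordered_set are const.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

struct MyClass {
    std::string name;
    int anInt1;
    // mutable: set elements are const, and the flag is not part of the key
    mutable bool bIsMarkedToDelete;

    MyClass(std::string n, int i)
        : name(std::move(n)), anInt1(i), bIsMarkedToDelete(false) {}

    // Equality is defined on (anInt1, name), as described above
    bool operator==(MyClass const& o) const {
        return anInt1 == o.anInt1 && name == o.name;
    }
};

namespace std {
    // anInt1 alone selects the bucket; name only participates in equality
    template <> struct hash<MyClass> {
        size_t operator()(MyClass const& o) const {
            return std::hash<int>()(o.anInt1);
        }
    };
}

// Hypothetical helper: erase every marked element, return how many were erased
inline std::size_t erase_marked(std::unordered_set<MyClass>& values) {
    std::size_t erased = 0;
    for (auto it = values.begin(); it != values.end();) {
        if (it->bIsMarkedToDelete) { it = values.erase(it); ++erased; }
        else ++it;
    }
    return erased;
}

// Small usage check: the duplicate (John, 0) is rejected by the set
inline void unordered_set_example() {
    std::unordered_set<MyClass> values{{"John", 0}, {"John", 0}};
    assert(values.size() == 1);
}
```

Hashing on anInt1 only means all elements with the same anInt1 share a bucket, which mirrors the AllThingsSpreadOut[i] lists from the question.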
If you really wanted contiguous storage, I can only assume you wanted this for performance:
you do not want to use pointers instead of objects, since that simply negates the memory layout ("AllThingsBunchedTogether") benefits and you'd be better off with the unordered_set or unordered_map as above
you do not want to use auto_unlink mode, since it cripples performance (by doing uncontrolled deletion triggers, by inhibiting constant-time size() and by creating thread safety issues)
instead, you should employ the above strategy, but with boost::intrusive::unordered_set instead; see http://www.boost.org/doc/libs/1_57_0/doc/html/intrusive/unordered_set_unordered_multiset.html
Here, again, is a proof-of-concept:
Live On Coliru
#include <algorithm>
#include <functional>
#include <iostream>
#include <string>
#include <vector>
#include <boost/intrusive/unordered_set.hpp>
namespace bic = boost::intrusive;
struct MyClass : bic::unordered_set_base_hook<bic::link_mode<bic::auto_unlink>>
{
std::string name;
int anInt1;
mutable bool bIsMarkedToDelete;
MyClass(std::string name, int i) : name(name), anInt1(i), bIsMarkedToDelete(false) {}
bool operator==(MyClass const& o) const { return anInt1 == o.anInt1 && name == o.name; }
struct hasher { size_t operator()(MyClass const& o) const { return o.anInt1; } };
};
typedef bic::unordered_set<MyClass, bic::hash<MyClass::hasher>, bic::constant_time_size<false> > HashTable;
int main() {
std::vector<MyClass> values {
MyClass { "John", 0 }, MyClass { "Mike", 1 }, MyClass { "Dagobart", 2 },
MyClass { "John", 3 }, MyClass { "Mike", 4 }, MyClass { "Dagobart", 5 },
MyClass { "John", 6 }, MyClass { "Mike", 7 }, MyClass { "Dagobart", 8 },
MyClass { "John", 9 }, MyClass { "Mike", 10 },
};
HashTable::bucket_type buckets[100];
HashTable hashtable(values.begin(), values.end(), HashTable::bucket_traits(buckets, 100));
for(int i = 0; i<=3; ++i) {
if(2 == i) {
for(auto& e: values) std::cout << e.name << " "; std::cout << "\n";
for(auto& e: values) e.bIsMarkedToDelete |= ("Mike" == e.name);
values.erase(std::remove_if(begin(values), end(values), std::mem_fn(&MyClass::bIsMarkedToDelete)), end(values));
}
std::cout << "i=" << i << ", values.size(): " << values.size() << "\n";
std::cout << "i=" << i << ", hashtable.size(): " << hashtable.size() << "\n";
}
values.clear();
std::cout << "Done\n";
}
Here's the error message, which you omitted:
Assertion `node_algorithms::inited(to_insert)' failed.
From this we can understand that an element is being inserted twice. This isn't valid with intrusive containers in general.
When you have your lists inside the loop, they are destroyed and recreated each time. But when they are outside, you never clear them, and you also never clear values, so this sequence occurs:
Add 11 elements to values.
Add all values to the lists.
Add 11 elements to values; it still has the previous 11 so now 22 elements.
Add all values to the lists. Crash on the first one, because it is already in a list.
One solution is to add values.clear() at the top of the while(!done) loop.
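To make the "inserted twice" failure concrete, here is a hand-rolled sketch of an intrusive list. This is not Boost.Intrusive's implementation, and Hook, Item and IntrusiveList are hypothetical names. The idea it shows is that the link pointers live inside the element, so an element can be in at most one such list at a time, and pushing it again while it is still linked has to be rejected (here with an exception where Boost uses an assertion).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical hook: the list's link pointers are stored in the element itself.
struct Hook {
    Hook* prev = nullptr;
    Hook* next = nullptr;
    Hook() = default;
    Hook(const Hook&) {}                          // a copy starts out unlinked
    Hook& operator=(const Hook&) { return *this; } // assignment keeps the links
    bool is_linked() const { return next != nullptr; }
};

struct Item : Hook {
    int value;
    explicit Item(int v) : value(v) {}
};

// Circular doubly linked list with a sentinel; it never owns its elements.
class IntrusiveList {
    Hook head_;
    std::size_t size_ = 0;
public:
    IntrusiveList() { head_.prev = head_.next = &head_; }
    IntrusiveList(const IntrusiveList&) = delete;
    IntrusiveList& operator=(const IntrusiveList&) = delete;
    ~IntrusiveList() { clear(); }

    void push_back(Item& item) {
        if (item.is_linked())  // the same precondition the Boost assertion checks
            throw std::logic_error("element is already in a list");
        item.prev = head_.prev;
        item.next = &head_;
        head_.prev->next = &item;
        head_.prev = &item;
        ++size_;
        assert(item.is_linked());
    }

    // Unlink every element so it can be inserted again later.
    void clear() {
        for (Hook* h = head_.next; h != &head_;) {
            Hook* next = h->next;
            h->prev = h->next = nullptr;
            h = next;
        }
        head_.prev = head_.next = &head_;
        size_ = 0;
    }

    std::size_t size() const { return size_; }
};
```

With the lists declared outside the loop, the second pass over values hits exactly this check on the first element. Clearing the lists (or values) before refilling them removes the conflict.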
I need a map which can have two keys, of different data types, yet point to the same struct.
struct DataStruct {
SomeEnum keyEnum; // <---- key as enum
std::string keyString; // <----- a key as a string
int arbitrarydata;
int moredata;
};
Then I want a std::map I can look up like:
std::map<SomeEnum||std::string, DataStruct> dataMap;
dataMap[SomeEnum::AValue] = dataStruct1;
dataMap["mykey"] = dataStruct2;
Is this even possible or do I need to make 2 maps? Seems a waste. Or do I need to overload an operator or something?
You can use std::pair, like this:
#include <iostream>
#include <map>
#include <utility>
typedef enum {A, B, C} en;
int main ()
{
en myen = A;
std::map<std::pair<char,int>, int> mymap;
mymap.insert ( std::pair<std::pair<char, int>,int>(std::make_pair('a',myen),200) );
mymap.insert ( std::pair<std::pair<char, int>,int>(std::make_pair('z',30),400) );
// showing contents:
std::cout << "mymap contains:\n";
for (std::map<std::pair<char,int>, int>::iterator it=mymap.begin(); it!=mymap.end(); ++it)
std::cout << "(" << it->first.first << ", " << it->first.second <<
") => " << it->second << '\n';
return 0;
}
Not an answer to the question:
Note, that in C++11, you can use enum class, which in general can be more useful.
A std::map can only have keys of the same type, but you can trick it with whatever key logic you want. Just be sure that they can compare properly:
struct DataStruct {
struct Key {
std::string keyString;
SomeEnum keyEnum;
int type;
Key(SomeEnum a) : keyEnum(a), type(0) { }
Key(const char * a) : keyString(a), type(1) { }
bool operator<(const Key & o) const {
if (type != o.type) return type < o.type;
else return type == 0 ? keyEnum < o.keyEnum : keyString < o.keyString;
}
};
int data;
};
Then you can use it almost the way you wanted:
std::map<DataStruct::Key, DataStruct> dataMap;
dataMap[SomeEnum::AValue] = dataStruct1;
dataMap["mykey"] = dataStruct2;
You need to be sure that keys of different types never compare equal to each other; that's why I first order them by type and then by their value.
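If C++17 is available, std::variant gives the same "order by type, then by value" behaviour without a hand-written Key: its operator< compares the index of the active alternative first and the held values second. A minimal sketch, where the enum values, the Key and DataMap aliases and make_example_map are assumptions of mine for illustration:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>

enum class SomeEnum { AValue, BValue };

struct DataStruct {
    int arbitrarydata = 0;
    int moredata = 0;
};

// One key type that holds either an enum or a string.
// Keys with different alternatives never compare equal.
using Key = std::variant<SomeEnum, std::string>;
using DataMap = std::map<Key, DataStruct>;

inline DataMap make_example_map() {
    DataMap m;
    m[Key{SomeEnum::AValue}] = DataStruct{1, 2};
    m[Key{std::string("mykey")}] = DataStruct{3, 4};
    return m;
}

// Small usage check: both kinds of key live in the same map
inline void variant_key_example() {
    DataMap m = make_example_map();
    assert(m.size() == 2);
    assert(m.at(Key{SomeEnum::AValue}).arbitrarydata == 1);
}
```

All enum keys sort before all string keys here, because SomeEnum is the first alternative in the variant.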