The Boost documentation doesn't elaborate much, but there is an (optional) KeyCompare function that can be passed to the ptree.
Anyone have a good example of using a custom KeyCompare function?
I have recently been working with a ptree that is really slow. My keys are long strings (paths), and I am assuming it's the string comparisons that make it slow.
From what I can glean, the default KeyCompare is std::less<Key>. I want to change this, ideally to something that just compares the hashes of the two strings.
It goes without saying (but I'll say it anyway) that I would use a different object for the key to facilitate this: Something that has (std::string+hash), rather than just a std::string. The hash would be calculated during construction.
Thanks,
Rik.
Found this in the Boost source code, an example of a case-insensitive KeyCompare:
template<class T>
struct less_nocase
{
typedef typename T::value_type Ch;
std::locale m_locale;
inline bool operator()(Ch c1, Ch c2) const
{
return std::toupper(c1, m_locale) < std::toupper(c2, m_locale);
}
inline bool operator()(const T &t1, const T &t2) const
{
return std::lexicographical_compare(t1.begin(), t1.end(),
t2.begin(), t2.end(), *this);
}
};
Then all you need to do is pass it in to the basic_ptree class:
typedef basic_ptree<std::string, std::string,
less_nocase<std::string> > iptree;
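For the hash-based comparison described in the question, a minimal sketch could look like the following. The names HashedKey and HashedLess are hypothetical, and std::map stands in for the container here, because plugging a non-string key type into basic_ptree may need additional support from the library that is not shown. The comparator orders by the cached hash first and only falls back to the string comparison when two hashes are equal, so it is still a strict weak ordering.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical key type: the string plus a hash computed once at construction.
struct HashedKey {
    std::string text;
    std::size_t hash;

    explicit HashedKey(std::string s)
        : text(std::move(s)), hash(std::hash<std::string>{}(text)) {}
};

// Hypothetical comparator: cheap integer comparison first, string comparison
// only when the hashes are equal (needed to keep distinct keys distinct).
struct HashedLess {
    bool operator()(const HashedKey& a, const HashedKey& b) const {
        if (a.hash != b.hash) {
            return a.hash < b.hash;
        }
        return a.text < b.text;
    }
};

// std::map is used only as a stand-in for an ordered container.
using HashedMap = std::map<HashedKey, int, HashedLess>;
```

Note that with such a comparator the iteration order follows the hash values, not the alphabetical order of the strings.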
I am a C++ beginner, and I am coding along with the 100th video of The Cherno's C++ series (at 16:50), but VS always gives me an error.
With the const part commented out, VS gives me error C3848. After adding const, VS gives me error C2676.
I have checked how to use std::hash on cppreference and searched for the errors on Google, but found nothing. It's just "a little bit" too hard for a beginner like me.
Below is the code.
#include<iostream>
#include<map>
#include<unordered_map>
struct CityRecord
{
std::string Name;
uint64_t Population;
double Latitude, Longtitude;
};
namespace std {
template<>
struct hash<CityRecord>
{
size_t operator()(const CityRecord& key) //const noexcept
{
return hash<std::string>()(key.Name);
}
};
}
int main()
{
std::unordered_map<CityRecord, uint32_t> foundedMap;
foundedMap[CityRecord{ "London", 500000, 2.4, 9.4 }] = 1850;
uint32_t NewYorkYear = foundedMap[CityRecord{ "NY", 7000000, 2.4, 9.4 }];
}
As a beginner, I just want to know how to use the hash function in this case.
There is a much easier solution that does not require opening the std namespace and specializing std::hash.
If you look at the definition of std::unordered_map on cppreference, you will read:
template<
class Key,
class T,
class Hash = std::hash<Key>,
class KeyEqual = std::equal_to<Key>,
class Allocator = std::allocator< std::pair<const Key, T> >
> class unordered_map;
It is clear and normal to hand in template parameters for the key type and the value type. However, if the key type is a custom type, as in your case, then you need to add additional functionality.
First, you need to add hash functionality. If you read about std::hash, you will see that the only function that will be called is the function call operator.
And this must be a "const" function, which will fix one of your problems.
And of course, you may add this function to your struct. That is completely OK. Please see the example code below. With that, you can give your own class as the template parameter for the hash functionality of std::unordered_map.
If we look at the next template parameter of the std::unordered_map, then we will find std::equal_to. And if we read about this in cppreference, then we will find the following statement:
Function object for performing comparisons. Unless specialised, invokes operator== on type T.
So, we need to add a comparison operator == to your custom struct to satisfy requirements.
Please note: It is a good approach to encapsulate the methods operating on the data of a class within the class, rather than defining them as free functions, because only the methods of the class should work on the data of the class. If you later change something in the class while a free function also works on its members, that function is easy to overlook. So, please try to follow that approach.
Then, next, the comparison. So, we define the "operator ==" in your class and then have to compare element by element.
To make this easier, there is a library function called std::tie. It creates a std::tuple of references to the given arguments, with the advantage that all comparison operators are already defined for tuples and can be reused immediately.
By following the above described approach, the whole implementation will be much simpler.
Please see the below example code:
#include <cstdint>
#include <iostream>
#include <map>
#include <string>
#include <tuple>
#include <unordered_map>
struct CityRecord
{
std::string Name;
uint64_t Population;
double Latitude, Longtitude;
// For std::equal_to
bool operator == (const CityRecord& cr) const { return std::tie(Name, Population, Latitude, Longtitude) == std::tie(cr.Name, cr.Population, cr.Latitude, cr.Longtitude); }
// For hashing
size_t operator()(const CityRecord& key) const { return std::hash<std::string>{}(key.Name); }
};
int main() {
// Definition of the unordered_map
std::unordered_map<CityRecord, uint32_t, CityRecord> foundedMap;
// Adding data
foundedMap[CityRecord{ "London", 500000, 2.4, 9.4 }] = 1850;
uint32_t NewYorkYear = foundedMap[CityRecord{ "NY", 7000000, 2.4, 9.4 }];
}
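As a variation on the same idea, the hasher can also live in its own small struct instead of inside CityRecord, which keeps the record itself from being callable. This is only a sketch: the names CityRecordHash and FoundedMap are made up for illustration, and hashing only Name is the same simplification used above.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <tuple>
#include <unordered_map>

struct CityRecord {
    std::string Name;
    std::uint64_t Population;
    double Latitude, Longtitude;

    // Used by std::equal_to<CityRecord>, the map's default KeyEqual.
    bool operator==(const CityRecord& other) const {
        return std::tie(Name, Population, Latitude, Longtitude) ==
               std::tie(other.Name, other.Population, other.Latitude, other.Longtitude);
    }
};

// Hypothetical separate hasher type; operator() must be const.
struct CityRecordHash {
    std::size_t operator()(const CityRecord& key) const noexcept {
        return std::hash<std::string>{}(key.Name);
    }
};

// The hasher is passed as the third template argument.
using FoundedMap = std::unordered_map<CityRecord, std::uint32_t, CityRecordHash>;
```

Usage is unchanged: foundedMap[CityRecord{ "London", 500000, 2.4, 9.4 }] = 1850; works with a FoundedMap exactly as before.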
You need to make the overloaded operator() for the specialization of hash for CityRecord a const member function as shown below. Additionally, we also need to overload operator== for CityRecord as shown below:
struct CityRecord
{
std::string Name;
uint64_t Population;
double Latitude, Longtitude;
//friend declaration for operator== not needed since we've a struct
};
//implement operator==
bool operator==(const CityRecord &lhs, const CityRecord &rhs)
{
return (lhs.Name == rhs.Name) && (lhs.Population == rhs.Population) && (lhs.Latitude ==rhs.Latitude) && (lhs.Longtitude == rhs.Longtitude);
}
namespace std {
template<>
struct hash<CityRecord>
{
//-----------------------------------------------vvvvv-->added this const
size_t operator()(const CityRecord& key) const
{
return hash<std::string>()(key.Name) ^ hash<uint64_t>()(key.Population) ^ hash<double>()(key.Latitude) ^ hash<double>()(key.Longtitude);
}
};
}
Here we use an (unnamed) hash<std::string> object to generate a hash code for Name, an object of type hash<uint64_t> to generate a hash from Population, an object of type hash<double> to generate a hash from Latitude, and finally another hash<double> object to generate a hash from Longtitude. Next, we exclusive-OR these results to form an overall hash code for the given CityRecord object.
Note that we defined our hash function to hash all the four data members so that our hash function will be compatible with our definition of operator== for CityRecord.
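One caveat with plain XOR: identical field hashes cancel each other, so a record whose Latitude equals its Longtitude contributes nothing from those two fields, and swapping the two values gives the same hash. A common alternative is a mixing step modelled on the well-known boost::hash_combine formula. The sketch below uses a local helper named hash_combine and a hypothetical functor CombinedCityHash; it is one possible way to mix the fields, not the only one.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

struct CityRecord {
    std::string Name;
    std::uint64_t Population;
    double Latitude, Longtitude;
};

// Local helper: folds the hash of `value` into `seed`. The constant and the
// shifts follow the classic boost::hash_combine formula.
template <class T>
void hash_combine(std::size_t& seed, const T& value) {
    seed ^= std::hash<T>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical hasher that mixes all four members in a fixed order.
struct CombinedCityHash {
    std::size_t operator()(const CityRecord& key) const noexcept {
        std::size_t seed = 0;
        hash_combine(seed, key.Name);
        hash_combine(seed, key.Population);
        hash_combine(seed, key.Latitude);
        hash_combine(seed, key.Longtitude);
        return seed;
    }
};
```

Because every step depends on the running seed, the order of the fields matters, which is what avoids the cancellation.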
I have a need to use std::pair<QColor, char> as a key of unordered_map. As for the pair, I know that there is boost functionality that can be used, but what about the color? Is it enough to just provide the hash template in the std namespace? If so, what would be the best attribute of the color to base the hash on to maximize the performance and minimize collisions? My first thought was about simple name(). If so
namespace std {
template<>
struct hash<Key>
{
std::size_t operator()(const Key& k) const {
return std::hash<std::string>()(k.name());
}
};
}
The code above is taken from C++ unordered_map using a custom class type as the key.
What you propose will probably work (although you would have to convert the color name from QString to std::string), but I would use the RGBA value of the color directly. It is a bit cheaper than going through the QString to std::string conversion and then hashing the string:
template<>
struct std::hash<QColor>
{
std::size_t operator()(const QColor& c) const noexcept
{
return std::hash<unsigned int>{}(c.rgba());
}
};
According to Qt's documentation, QRgb returned by QColor::rgba() is some type equivalent to unsigned int.
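The pair itself still needs a hasher, since the standard library does not provide std::hash for std::pair. The following is a sketch only: FakeColor is a stand-in for QColor so that the snippet compiles without Qt (the only thing assumed about the color is an rgba() accessor and operator==), and ColorCharHash and ColorCharMap are hypothetical names. The mixing line follows the boost::hash_combine formula.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

// Stand-in for QColor (assumption): only rgba() and operator== are used.
struct FakeColor {
    unsigned int value;
    unsigned int rgba() const { return value; }
    bool operator==(const FakeColor& other) const { return value == other.value; }
};

// Hypothetical hasher for the pair key: hash the RGBA value, then mix in the char.
struct ColorCharHash {
    std::size_t operator()(const std::pair<FakeColor, char>& key) const noexcept {
        std::size_t seed = std::hash<unsigned int>{}(key.first.rgba());
        seed ^= std::hash<char>{}(key.second) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        return seed;
    }
};

// Equality comes from std::pair's operator==, which compares both members.
using ColorCharMap = std::unordered_map<std::pair<FakeColor, char>, int, ColorCharHash>;
```

With the real QColor, the same functor shape should apply with FakeColor replaced by QColor.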
I have a map of structs that holds several named values like this:
struct MyData {
MyType dataA;
std::string dataB;
int dataC;
};
typedef std::pair<std::string, MyData> PairType;
std::map<PairType::first_type, PairType::second_type> dataMap;
This is defined in a header file of a compilation unit that calls a function from a library.
Because the library function does not know about my type definitions, I can't pass dataMap directly.
The function only actually needs the dataA struct member and already knows about MyType, so I could pass a std::map<std::string, MyType> instead.
What's the most elegant way of cutting just the data I need from the map of structs and saving it into a new map with the same keys but only the type and values from dataA?
Preferably for C++0x without usage of boost or other external libraries, but solutions for newer standards are also welcome for educational purposes.
I'm basically looking for an equivalent of Python's
newDict = {key:value.dataA for (key,value) in oldDict.items()}
You can use a range-based for loop to make the copy really easily. That would look like
std::map<std::string, MyType> my_type_map;
for (const auto& pair : dataMap)
{
my_type_map.emplace(pair.first, pair.second.dataA);
}
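The same copy can also be written with std::transform and std::inserter, which comes a little closer to the Python comprehension while staying within the standard library. This is a sketch under assumptions: MyType is not shown in the question, so a placeholder definition is used here, and extractDataA is a hypothetical helper name.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <string>
#include <utility>

struct MyType { int value; };  // placeholder; the real MyType is not shown

struct MyData {
    MyType dataA;
    std::string dataB;
    int dataC;
};

// Builds a new map with the same keys, keeping only the dataA member.
std::map<std::string, MyType> extractDataA(const std::map<std::string, MyData>& dataMap) {
    std::map<std::string, MyType> result;
    std::transform(dataMap.begin(), dataMap.end(),
                   std::inserter(result, result.end()),
                   [](const std::map<std::string, MyData>::value_type& entry) {
                       return std::make_pair(entry.first, entry.second.dataA);
                   });
    return result;
}
```

Since the source map is already sorted, passing result.end() as the insertion hint keeps each insertion cheap.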
If you want this as a single expression, you are going to need something like boost::transform_iterator, either by including that, or writing an iterator yourself.
Given a conversion function (or equivalent lambda)
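// The map's value_type is std::pair<const std::string, MyData>, so take that by const reference
std::pair<std::string, MyType> convert(const std::map<std::string, MyData>::value_type& pair){
return { pair.first, pair.second.dataA };
}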
You can declare newDict and populate it
/* can be const */ std::map<std::string, MyType> newDict {
boost::make_transform_iterator(oldDict.begin(), convert),
boost::make_transform_iterator(oldDict.end(), convert)
};
Or you can use a view type
auto newDict = boost::copy_range<std::map<std::string, MyType>>(oldDict | std::ranges::views::transform(convert));
I would like to implement a class wrapper for database.
Currently, I'm working on a createTable function.
The way I have tried to make it work is, that the user specifies the types
as a template parameters, and the column names as an initialiser list,
this is the template of the function:
template <typename ... Ts>
bool createTable(const std::string & tableName, const std::initializer_list<std::string> & columnNames);
And this is the body of the method:
template<typename ... Ts>
bool DatabaseConnection::createTable(const std::string &tableName, const std::initializer_list<std::string> & columnNames)
{
constexpr size_t num_cols = sizeof...(Ts);
assert(num_cols == columnNames.size());
auto typetuple = std::tuple<Ts...>();
std::vector<std::tuple<std::string, std::string>> columnNameAndType(num_cols);
auto columnNameIterator = columnNames.begin();
for(unsigned it = 0; it < columnNames.size(); ++it){
typedef std::tuple_element<it, typetuple>::type c; // non-type template argument is not a constant expression
if(is_same<c, int> ...) //pseudocode
std::string dbtype = "INTEGER"; //pseudocode
}
}
Sadly, the tuple_element line doesn't work, because it's not really a
constant expression.
Now, someone might ask, why I want to call it like this:
createTable<int, std::string>("Users", {"ID", "Name"});
instead of just passing two initialiser lists?
Well, I just want to distance the user from the interface. If I were able to determine
the i-th type, I could just use something like decltype or is_same to determine the type used in the database creation query: the user just says what type he/she wants and the Database class
determines the best database type to match the user's request.
Now, it could still be done with initialiser lists, but it wouldn't be at compile time, and
I'm just curious to see if it's possible at compile time.
I hope my explanation of the problem is sufficient.
Of course this is mostly a theoretical problem, but I think many people
would be interested in such a syntax, and I haven't found any solutions on the internet yet.
This interface is certainly possible.
A for loop isn't going to do it, because one statement/variable/expression/etc. can't have different types on different evaluations of a for substatement. The loop will need to be via pack expansion instead.
One or more private helper member functions could help for this. It would be possible to get it all in one function definition using a generic lambda, but a little unpleasant.
// private static
template <typename T>
std::string DatabaseConnection::dbTypeName()
{
if constexpr (std::is_same_v<T, int>)
return "INTEGER";
// ...
else
static_assert(!std::is_same_v<T,T>, "Unsupported type argument");
}
template<typename ... Ts>
bool DatabaseConnection::createTable(
const std::string &tableName,
std::initializer_list<std::string> columnNames)
{
constexpr size_t num_cols = sizeof...(Ts);
assert(num_cols == columnNames.size());
std::vector<std::tuple<std::string, std::string>> columnNameAndType;
auto columnNameIterator = columnNames.begin();
(columnNameAndType.emplace_back(*columnNameIterator++, dbTypeName<Ts>()), ...);
// ...
}
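To try the pack-expansion approach in isolation, here is a self-contained sketch written with free functions instead of DatabaseConnection members. It assumes C++17. The helper names dependent_false and columnDefinitions are hypothetical, and the mapping of std::string to "TEXT" is an assumption added for illustration; only the int to "INTEGER" mapping comes from the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Always false, but dependent on T, so the static_assert below only fires
// when the final else branch is actually instantiated.
template <typename T>
struct dependent_false : std::false_type {};

template <typename T>
std::string dbTypeName() {
    if constexpr (std::is_same_v<T, int>)
        return "INTEGER";
    else if constexpr (std::is_same_v<T, std::string>)
        return "TEXT";  // assumed mapping, not from the original answer
    else
        static_assert(dependent_false<T>::value, "Unsupported type argument");
}

// Pairs each column name with the database type name of the matching Ts.
template <typename... Ts>
std::vector<std::pair<std::string, std::string>>
columnDefinitions(std::initializer_list<std::string> columnNames) {
    assert(sizeof...(Ts) == columnNames.size());
    std::vector<std::pair<std::string, std::string>> columns;
    columns.reserve(sizeof...(Ts));
    auto name = columnNames.begin();
    // Fold over the comma operator: one emplace_back per type, left to right.
    (columns.emplace_back(*name++, dbTypeName<Ts>()), ...);
    return columns;
}
```

Calling columnDefinitions<int, std::string>({"ID", "Name"}) yields the pairs ("ID", "INTEGER") and ("Name", "TEXT"), from which a CREATE TABLE statement could be assembled.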
Suppose I am using std::unordered_map<std::string, Foo> in my code. It's nice and convenient, but unfortunately every time I want to do a lookup (find()) in this map I have to come up with an instance of std::string.
For instance, let's say I'm tokenizing some other string and want to call find() on every token. This forces me to construct an std::string around every token before looking it up, which requires an allocator (std::allocator, which amounts to a CRT malloc()). This can easily be slower than the actual lookup itself. It also contends with other threads since heap management requires some form of synchronization.
A few years ago I found the Boost.intrusive library; it was just a beta version back then. The interesting thing was it had a container called boost::intrusive::iunordered_set which allowed code to perform lookups with any user-supplied type.
I'll explain it how I'd like it to work:
struct immutable_string
{
const char *pf, *pl;
struct equals
{
bool operator()(const std::string& left, const immutable_string& right) const
{
if (left.length() != static_cast<std::size_t>(right.pl - right.pf))
return false;
return std::equal(right.pf, right.pl, left.begin());
}
};
struct hasher
{
size_t operator()(const immutable_string& s) const
{
return boost::hash_range(s.pf, s.pl);
}
};
};
struct string_hasher
{
size_t operator()(const std::string& s) const
{
return boost::hash_range(s.begin(), s.end());
}
};
std::unordered_map<std::string, Foo, string_hasher> m;
m["abc"] = Foo(123);
immutable_string token; // token refers to a substring inside some other string
auto it = m.find(token, immutable_string::equals(), immutable_string::hasher());
Another thing would be to speed up the "find and insert if not found" use case: the trick with lower_bound() only works for ordered containers. The intrusive container has methods called insert_check() and insert_commit(), but that's for a separate topic I guess.
Turns out boost::unordered_map (as of 1.42) has a find overload that takes CompatibleKey, CompatibleHash, CompatiblePredicate types, so it can do exactly what I asked for here.
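For comparison, the standard library offers a similar facility for ordered containers: a std::map declared with the transparent comparator std::less<> accepts any type comparable with the key in find(), so a std::string_view can be looked up without building a std::string. The sketch below assumes C++17 and deliberately swaps the hash map for an ordered std::map; Foo's definition and the helper findToken are placeholders. (C++20 added a comparable mechanism for unordered containers when both the hasher and the equality predicate are transparent, which is not shown here.)

```cpp
#include <functional>
#include <map>
#include <string>
#include <string_view>

struct Foo { int value; };  // placeholder for the question's Foo

// std::less<> is transparent, which enables the templated find() overload.
using FooMap = std::map<std::string, Foo, std::less<>>;

// Looks up a token that refers into some other buffer; no std::string is built.
const Foo* findToken(const FooMap& m, std::string_view token) {
    auto it = m.find(token);
    return it == m.end() ? nullptr : &it->second;
}
```

The token only has to stay valid for the duration of the call, since the map owns its own std::string keys.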
When it comes to lexing, I personally use two simple tricks:
I use StringRef (similar to LLVM's) which just wraps a char const* and a size_t and provides string-like operations (only const operations, obviously)
I pool the encountered strings using a bump allocator (using lumps of say 4K)
The two combined are quite efficient, though one needs to understand that all StringRef objects that point into the pool are invalidated as soon as the pool is destroyed.
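A rough sketch of the two tricks together, assuming C++17: std::string_view plays the role of StringRef, and a std::deque<std::string> stands in for the bump allocator (it is simpler, and appending to a deque does not move existing elements, so views into them stay valid). TokenTable and Foo are hypothetical names used only for this illustration.

```cpp
#include <deque>
#include <string>
#include <string_view>
#include <unordered_map>

struct Foo { int value; };  // placeholder payload

class TokenTable {
public:
    TokenTable() = default;
    // Copying would leave the views pointing into the other object's pool.
    TokenTable(const TokenTable&) = delete;
    TokenTable& operator=(const TokenTable&) = delete;

    // Stores an owning copy of the token once and indexes it by a view.
    void insert(std::string_view token, Foo foo) {
        if (index_.find(token) != index_.end()) {
            return;  // already pooled
        }
        pool_.emplace_back(token);
        index_.emplace(std::string_view(pool_.back()), foo);
    }

    // Lookup takes a view, so no std::string is constructed per token.
    const Foo* find(std::string_view token) const {
        auto it = index_.find(token);
        return it == index_.end() ? nullptr : &it->second;
    }

private:
    std::deque<std::string> pool_;                     // owns the characters
    std::unordered_map<std::string_view, Foo> index_;  // keys point into pool_
};
```

As with the StringRef pool described above, every key view becomes invalid once the TokenTable is destroyed.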