Is it possible to get hash values as compile-time constants?

I thought I'd try selecting different options as strings by hashing them, but this doesn't work:
#include <type_traits>
#include <string>
inline void selectMenuOptionString(const std::string& str)
{
switch (std::hash<std::string>()(str))
{
case std::hash<std::string>()(std::string("Selection one")) : break;
// Expression must have a constant value
}
}
inline void selectMenuOptionString2(const std::string& str)
{
size_t selectionOneHash = std::hash<std::string>()(std::string("Selection one"));
switch (std::hash<std::string>()(str))
{
case selectionOneHash: // Expression must have a constant value
// The variable of selectionOneHash cannot be used as a constant
}
constexpr size_t hash = std::hash<int>()(6); // Expression must have a constant value
}
It seems I can't get hash values at compile time. From what I've read each different input should yield the same unique output every time, with a very low chance of collision. Given these properties couldn't the hash value be calculated at compile time? I don't know much at all about hashing, I usually use an unordered_map, but I wanted to try something new for learning's sake.

std::hash::operator() isn't constexpr, so you can't just use it. Instead, you'd have to write your own constexpr hash function. For example, the following is the FNV-1a hash algorithm (untested):
#include <cstddef>
#include <string>
#include <string_view>
template <typename Str>
constexpr size_t hashString(const Str& toHash)
{
// For this example, I'm requiring size_t to be 64-bit, but you could
// easily change the offset and prime used to the appropriate ones
// based on sizeof(size_t).
static_assert(sizeof(size_t) == 8);
// FNV-1a 64 bit algorithm
size_t result = 0xcbf29ce484222325; // FNV offset basis
for (char c : toHash) {
result ^= static_cast<unsigned char>(c); // use the byte value; a plain char may be signed
result *= 1099511628211; // FNV prime
}
return result;
}
And then you can use it:
int selectMenuOptionString(const std::string& str)
{
switch (hashString(str))
{
case hashString(std::string_view("Selection one")): return 42;
default: return 0;
}
}
Note that if you wrote hashString("Selection one"), it would actually hash the null terminator as well, so you might want to have an overload to catch string literals, such as:
template <size_t N>
constexpr size_t hashString(char const (&toHash)[N])
{
return hashString(std::string_view(toHash));
}
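One way to gain confidence in a constexpr hash like this is to check it against known FNV-1a values with static_assert. The following is a minimal sketch, assuming C++17; the name fnv1a64 is hypothetical, and the result type is fixed to std::uint64_t so it does not depend on sizeof(size_t):

```cpp
#include <cstdint>
#include <string_view>

// FNV-1a over the bytes of a string_view, fixed at 64 bits.
// fnv1a64 is a hypothetical name used only for this sketch.
constexpr std::uint64_t fnv1a64(std::string_view text) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;   // FNV offset basis
    for (char c : text) {
        hash ^= static_cast<unsigned char>(c);    // hash the raw byte value
        hash *= 0x100000001b3ULL;                 // FNV prime (1099511628211)
    }
    return hash;
}

// Both checks are evaluated by the compiler, so a wrong constant fails the build.
static_assert(fnv1a64("") == 0xcbf29ce484222325ULL,
              "empty input yields the offset basis");
static_assert(fnv1a64("a") == 0xaf63dc4c8601ec8cULL,
              "known FNV-1a 64-bit value for \"a\"");
```

Because the parameter is a std::string_view, a string literal argument is hashed without its null terminator.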

You'll need to implement your own hash function, because there's no suitable instantiation of std::hash that's constexpr. Here's a cheap-and-dirty...
EDIT: In order not to be humiliated too badly by Justin's answer, I added a 32 bit branch.
constexpr size_t hash(const char *str) {
static_assert(sizeof(size_t) == 8 || sizeof(size_t) == 4);
size_t h = 0;
if constexpr(sizeof(size_t) == 8) {
h = 1125899906842597L; // prime
} else {
h = 4294967291L;
}
int i = 0;
while (str[i] != 0) {
h = 31 * h + str[i++];
}
return h;
}

I just wanted to add this because I think it's cool. The constexpr strlen I got from a question here: constexpr strlen
#include <iostream>
#include <string>
int constexpr strlength(const char* str)
{
return *str ? 1 + strlength(str + 1) : 0;
}
size_t constexpr Hash(const char *first)
{ // FNV-1a hash function
const size_t FNVoffsetBasis = 14695981039346656037ULL;
const size_t FNVprime = 1099511628211ULL;
const size_t count = strlength(first);
size_t val = FNVoffsetBasis;
for (size_t next = 0; next < count; ++next)
{
val ^= (size_t)(unsigned char)first[next]; // go through unsigned char to avoid sign extension
val *= FNVprime;
}
return val;
}
inline void selectMenuOptionString(const std::string& str)
{
switch (Hash(str.c_str()))
{
case Hash("Selection one"): /*Do something*/ break;
case Hash("Selection two"): /*Do something*/ break;
}
}
int main()
{
static_assert(strlength("Hello") == 5, "String length not equal");
}
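A switch on hashes only proves that the hashes match, not the strings. The following sketch, assuming C++17, adds a string comparison inside each case and a user-defined literal to keep the labels short; the names fnv1a64, _hash and selectMenuOption are hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// 64-bit FNV-1a, usable in constant expressions.
constexpr std::uint64_t fnv1a64(std::string_view text) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;   // FNV offset basis
    for (char c : text) {
        hash ^= static_cast<unsigned char>(c);
        hash *= 0x100000001b3ULL;                 // FNV prime
    }
    return hash;
}

// Hypothetical literal suffix so case labels read as "text"_hash.
constexpr std::uint64_t operator""_hash(const char* text, std::size_t length) {
    return fnv1a64(std::string_view(text, length));
}

// Returns 1 or 2 for the known options and 0 for anything else.
constexpr int selectMenuOption(std::string_view option) {
    switch (fnv1a64(option)) {
    case "Selection one"_hash:
        // Equal hashes do not prove equal strings, so confirm the match.
        return option == "Selection one" ? 1 : 0;
    case "Selection two"_hash:
        return option == "Selection two" ? 2 : 0;
    default:
        return 0;
    }
}
```

If two case labels ever hashed to the same value, the compiler would reject the duplicate case, so collisions between the known options are caught at build time; only collisions with unknown run-time input need the extra comparison.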

You can't get the hash of a runtime value at compile-time, no.
Even if you passed std::hash a constant expression, it is not defined to be able to do its hashing work at compile-time.
As far as I know (which isn't far), you'd have to come up with some monstrous template metahackery (or, worse, macros!) to do this. Personally, if your text input is known at build, I'd just pregenerate a hash outside of the code, perhaps in some Python-driven pre-build step.

Related

str::find() function for the const char*

const char *attribute[] =
{"abc","efg","hij","lmn","opq","rst","uvw","Xyz"};
I want to find out whether "lmn" is in the above array (a Boolean) and, if so, its location.

This example will show you how to get both a bool and an index back.
Demo here : https://onlinegdb.com/vHHJ9QG1M
#include <array>
#include <limits>
#include <algorithm>
#include <iostream>
#include <string_view>
// make a struct to be able to return two (readable) values from function
// I almost never use std::pair because it results in hard-to-read code
// where you have to check the semantics of first/second over and over again.
struct is_attribute_result_t
{
// conversion to bool so result can be directly used in "if's"
constexpr operator bool() const
{
return is_attribute;
}
bool is_attribute{false};
std::size_t index{std::numeric_limits<std::size_t>::max()};
};
// make a std::array that is usable at compile time
// use string_view because it implements operator== for the whole string (const char* doesn't)
constexpr std::array<std::string_view,8> attributes = {"abc","efg","hij","lmn","opq","rst","uvw","xyz"};
// make a function that can be evaluated at compile time
// so no std::find, just use availability of (constexpr) operator== on string_view
constexpr is_attribute_result_t test_attribute(const std::string_view& attribute)
{
is_attribute_result_t result;
for(std::size_t n = 0; n < attributes.size(); ++n)
{
if(attribute == attributes[n])
{
result.is_attribute = true;
result.index = n;
return result;
}
}
return result;
}
int main()
{
// nice thing is you can now also check at compile time.
static_assert(test_attribute("abc"));
static_assert(test_attribute("abc").index == 0ul);
static_assert(test_attribute("lmn"));
static_assert(test_attribute("lmn").index == 3ul);
static_assert(!test_attribute("123"));
// and of course you can still use the function at runtime too
if ( auto result = test_attribute("lmn"))
{
std::cout << "lmn is an attribute, and is found at index = " << result.index << "\n";
}
return 0;
}
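If a dedicated result struct feels heavy, the same lookup can return a std::optional index. This is only a sketch of that alternative, assuming C++17; kAttributes and findAttribute are hypothetical names:

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <string_view>

// Same attribute list as above, usable at compile time.
constexpr std::array<std::string_view, 8> kAttributes = {
    "abc", "efg", "hij", "lmn", "opq", "rst", "uvw", "xyz"};

// Returns the index of `name`, or std::nullopt when it is not an attribute.
constexpr std::optional<std::size_t> findAttribute(std::string_view name) {
    for (std::size_t i = 0; i < kAttributes.size(); ++i) {
        if (kAttributes[i] == name) {
            return i;
        }
    }
    return std::nullopt;
}

static_assert(findAttribute("lmn").has_value(), "lmn is an attribute");
static_assert(*findAttribute("lmn") == 3, "lmn is at index 3");
static_assert(!findAttribute("123"), "123 is not an attribute");
```

The optional converts to bool in an if statement, so the call site reads much like the struct version: if (auto index = findAttribute("lmn")) { /* use *index */ }.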

Is there a __builtin_constant_p() for Visual C++?

Is there some function like GCC's __builtin_constant_p() for Microsoft Visual Studio? As I understand, the function returns non-zero if the argument is constant, like a string literal.
In the answer here (How to have "constexpr and runtime" alias) is a nice use case of it.
EDIT:
My idea was instead of writing something like:
#include <string.h>
int foo() {
return strlen("text");
}
I could write:
#include <string.h>
// template_strlen() would be a function that gets the length of a compile-time const string via templates
#define STRLEN(a) (__builtin_constant_p(a) ? template_strlen(a) : strlen(a))
int foo() {
return STRLEN("text");
}
(I guess that is about what was written in the linked question.)
All I need for that is a variant of __builtin_constant_p().

Here is an example of how to detect a string's length at compile time (which answers the second question rather than the initial one).
Please note, however, that most compilers already replace strlen("bob") with 3 at the very first optimization level, so I doubt it has any use in practice.
#include <cstddef>
#include <cstdio>
#include <cstring>
template <typename T>
struct StrLenHelper
{
static constexpr size_t len(T) { return 0; }
};
template <size_t sel>
struct StrLenHelper<const char (&)[sel]>
{
static constexpr size_t len(const char (&a)[sel]) { return sel-1; }
};
template <>
struct StrLenHelper<const char*>
{
static size_t len(const char * a) { return strlen(a); }
};
#define StrLen(X) StrLenHelper<decltype(X)>::len(X)
Proof that it works on a recent compiler:
template <size_t A>
struct Test { enum T { value = A }; };
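// Outputs "5 5 4" if argv[0] is "test"
int main(int argc, char** argv)
{
// decltype(argv[0]) is char*&, which would select the primary template (length 0),
// so cast to const char* to reach the run-time strlen specialization.
printf("%zu %zu %zu\n", static_cast<size_t>(Test<StrLen("bobby")>::value), StrLen("bobby"), StrLen(static_cast<const char*>(argv[0])));
return 0;
}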
Some coding patterns will not trigger the compile-time behaviour. For example, with constexpr const char * b = "bob";, StrLen(b) does not select the array specialization, because at the point of the call the type is a pointer rather than a char array (constexpr is not a modifier you can select upon in a template, or at least I don't know how).
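As a macro-free alternative, a constexpr function is evaluated at compile time when its result is needed in a constant expression and at run time otherwise, which covers much of what the STRLEN macro was after. A minimal sketch, assuming C++17; literalLength and cstrLength are hypothetical names:

```cpp
#include <cstddef>
#include <string_view>

// Length of a string literal, read from the array type; no characters are inspected.
template <std::size_t N>
constexpr std::size_t literalLength(const char (&)[N]) {
    return N - 1;  // N counts the terminating '\0'
}

// Length of any null-terminated string. std::string_view computes it in a
// constexpr-friendly way, so this also works inside static_assert.
constexpr std::size_t cstrLength(const char* text) {
    return std::string_view(text).size();
}

static_assert(literalLength("bobby") == 5, "length from the array type");
static_assert(cstrLength("text") == 4, "length computed at compile time");
```

Called with a run-time pointer such as argv[0], cstrLength simply runs as an ordinary function.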

In Visual Studio 2012 and Visual Studio 2013 there is the _IS_LITERAL_TYPE macro which makes use of std::is_literal_type, which is documented at http://www.cplusplus.com/reference/type_traits/is_literal_type/.
The following is a relevant excerpt from the documentation of is_literal_type.
"""Trait class that identifies whether T is a literal type.
A literal type is a type that can qualify as constexpr."""
Perhaps this would suffice.
The following excerpt from the documentation for __builtin_constant_p leads me to believe it will.
"You can use the built-in function __builtin_constant_p to determine if a value is known to be constant at compile-time..."
To me the phrases "is a literal type," "constexpr," and "known to be constant at compile-time" have the same meaning. Perhaps I am mistaken.
Then again, I will be the first to admit that I am not certain.

If is_literal_type is not what you want, the following function might be of use. With it I was able to tell the difference between a char string that was defined as follows and one that was allocated on the heap.
LPCTSTR constString = _T("Hello World!");
My implementation of constant_p is as follows.
int constant_p(const void *p)
{
static bool s_init = false;
static ULONGLONG s_TextSegmentStartVirtualAddress = 0;
static ULONGLONG s_TextSegmentEndVirtualAddress = 0;
static ULONGLONG s_RDataSegmentStartVirtualAddress = 0;
static ULONGLONG s_RDataSegmentEndVirtualAddress = 0;
if (! s_init)
{
s_init = true;
PIMAGE_NT_HEADERS pNtHeaders = ::ImageNtHeader(
reinterpret_cast<PVOID>(::GetModuleHandle(NULL)));
if (! pNtHeaders)
{
return 0;
}
ULONGLONG ImageBase = pNtHeaders->OptionalHeader.ImageBase;
PIMAGE_SECTION_HEADER pSectionHeader = (PIMAGE_SECTION_HEADER)(pNtHeaders + 1);
for (WORD i = 0; i < pNtHeaders->FileHeader.NumberOfSections; ++i)
{
char *name = (char*)pSectionHeader->Name;
if (0 == ::strcmp(name, ".text"))
{
s_TextSegmentStartVirtualAddress = ImageBase
+ pSectionHeader->VirtualAddress;
s_TextSegmentEndVirtualAddress = s_TextSegmentStartVirtualAddress
+ pSectionHeader->SizeOfRawData;
}
else if (0 == ::strcmp(name, ".rdata"))
{
s_RDataSegmentStartVirtualAddress = ImageBase
+ pSectionHeader->VirtualAddress;
s_RDataSegmentEndVirtualAddress = s_RDataSegmentStartVirtualAddress
+ pSectionHeader->SizeOfRawData;
}
pSectionHeader++;
}
}
if (0 == s_TextSegmentStartVirtualAddress)
{
// Something went wrong. Give up.
return 0;
}
ULONGLONG test = reinterpret_cast<ULONGLONG>(p);
if (
s_TextSegmentStartVirtualAddress <= test
&& test <= s_TextSegmentEndVirtualAddress
)
{
return 1;
}
else if (
s_RDataSegmentStartVirtualAddress <= test
&& test <= s_RDataSegmentEndVirtualAddress
)
{
return 1;
}
return 0;
}
Note you need to include DbgHelp.h and link with DbgHelp.lib in order for this to work.
I hope one of my proposed solutions works for you. I would like to know.

Condensing a do-while loop to a #define macro

Consider the following sample code (I actually work with longer binary strings but this is enough to explain the problem):
void enumerateAllSubsets(unsigned char d) {
unsigned char n = 0;
do {
cout<<binaryPrint(n)<<",";
} while ( n = (n - d) & d );
}
The function (due to Knuth) effectively loops through all subsets of a binary string;
For example :
33 = '00100001' in binary and enumerateAllSubsets(33) would produce:
00000000, 00100000, 00000001, 00100001.
I need to write a #define which would make
macroEnumerate(n,33)
cout<<binaryPrint(n)<<",";
behave in a way equivalent to enumerateAllSubsets(33). (well, the order might be rearranged)
Basically, I need the ability to perform various operations on subsets of a set.
Doing something similar with for-loops is trivial:
for(int i=0;i < a.size();i++)
foo(a[i]);
can be replaced with:
#define foreach(index,container) for(int index=0;index < container.size();index++)
...
foreach(i,a)
foo(a[i]);
The problem with enumerateAllSubsets() is that the loop body needs to be executed once unconditionally and as a result the do-while cannot be rewritten as for.
I know that the problem can be solved by STL-style templated function and a lambda passed to it (similar to STL for_each function), but some badass #define macro seems like a cleaner solution.

Assuming C++11, define a range object:
#include <iostream>
#include <iterator>
#include <cstdlib>
template <typename T>
class Subsets {
public:
Subsets(T d, T n = 0) : d_(d), n_(n) { }
Subsets begin() const { return *this; }
Subsets end() const { return {0, 0}; }
bool operator!=(Subsets const & i) const { return d_ != i.d_ || n_ != i.n_; }
Subsets & operator++() {
if (!(n_ = (n_ - d_) & d_)) d_ = 0;
return *this;
}
T operator*() const { return n_; }
private:
T d_, n_;
};
template <typename T>
inline Subsets<T> make_subsets(T t) { return Subsets<T>(t); }
int main(int /*argc*/, char * argv[]) {
int d = atoi(argv[1]);
for (auto i : make_subsets(d))
std::cout << i << "\n";
}
I've made it quite general in case you want to work with, e.g., uint64_t.

One option would be to use a for loop that always runs at least once, such as this:
for (bool once = true; once? (once = false, true) : (n = (n - d) & d); )
// loop body
On the first iteration, the once variable gets cleared and the expression evaluates to true, so the loop executes. From that point forward, the actual test-and-step logic controls the loop.
From here, rewriting this to a macro should be a lot easier.
Hope this helps!
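One possible macro built on the same "run the first pass unconditionally" idea is sketched below. It is an illustration rather than a tested drop-in: the helper variable first_ is a hypothetical name, the macro declares n itself as unsigned, and d is evaluated more than once, so it should be a side-effect-free expression.

```cpp
// Hypothetical macro: declares `n` and visits every subset of the bit mask `d`,
// starting with the empty subset. `first_` forces the first pass, like do-while.
#define macroEnumerate(n, d)                              \
    for (unsigned n = 0, first_ = 1; first_ || n != 0;    \
         first_ = 0, n = (n - (d)) & (d))

// Counts the subsets of d by running the macro loop.
constexpr unsigned countSubsets(unsigned d) {
    unsigned count = 0;
    macroEnumerate(n, d) { ++count; }
    return count;
}

// Adds up all subsets of d, to show that `n` is usable in the loop body.
constexpr unsigned sumSubsets(unsigned d) {
    unsigned sum = 0;
    macroEnumerate(n, d) sum += n;
    return sum;
}

// 33 = 0b00100001 has the subsets 0, 1, 32 and 33.
static_assert(countSubsets(33) == 4, "four subsets of a two-bit mask");
static_assert(sumSubsets(33) == 66, "0 + 1 + 32 + 33");
```

With this, macroEnumerate(n, 33) followed by a statement or block reads like the foreach macro from the question.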

You can do a multiline macro that uses an expression, like this:
#define macroenum(n, d, expr) \
n = 0; \
do { \
(expr); \
} while ((n = (n - (d)) & (d)) != 0)
int main(int argc, const char* argv[])
{
enumerateAllSubsets(33);
int n;
macroenum(n, 33, cout << n << ",");
}
As others have mentioned this will not be considered very clean by many - amongst other things, it relies on the variable 'n' existing in scope. You may need to wrap expr in another set of parens, but I tested it with g++ and got the same output as enumerateAllSubsets.

It seems like your goal is to be able to do something like enumerateAllSubsets but change the action performed for each iteration.
In C++ you can do this with a function in the header file:
template<typename Func>
inline void enumerateAllSubsets(unsigned char d, Func f)
{
unsigned char n = 0;
do { f(n); } while ( n = (n - d) & d );
}
Sample usage:
enumerateAllSubsets(33, [](auto n) { cout << binaryPrint(n) << ','; } );

How construct hash function for a user defined type?

For example, in the following struct:
1) editLine is a pointer to a data line which has CRLF,
2) nDisplayLine is the display line index of this editLine,
3) start is the offset in the display line,
4) len is the length of the text;
struct CacheKey {
const CEditLine* editLine;
int32 nDisplayLine;
int32 start;
int32 len;
friend bool operator==(const CacheKey& item1, const CacheKey& item2) {
return (item1.start == item2.start && item1.len == item2.len && item1.nDisplayLine == item2.nDisplayLine &&
item1.editLine == item2.editLine);
}
CacheKey() {
editLine = NULL;
nDisplayLine = 0;
start = 0;
len = 0;
}
CacheKey(const CEditLine* editLine, int32 dispLine, int32 start, int32 len) :
editLine(editLine), nDisplayLine(dispLine), start(start), len(len)
{
}
int hash() {
return (int)((unsigned char*)editLine - 0x10000) + nDisplayLine * nDisplayLine + start * 2 - len * 1000;
}
};
Now I need to put it into a std::unordered_map<int, CacheItem> cacheMap_
The problem is how to design the hash function for this structure. Are there any guidelines?
How can I make sure the hash function is collision-free?

To create a hash function, you can use std::hash, which is defined for integers and pointers, to hash each field. Then you can combine the results "as the Boost guys do" (because writing a good hash is non-trivial), as explained here: http://en.cppreference.com/w/cpp/utility/hash.
Here is a hash_combine function:
inline void hash_combine(std::size_t& seed, std::size_t v)
{
seed ^= v + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}
So the "guideline" is more or less what is shown on cppreference.
You CAN'T be sure your hash function is collision-free. Collision-free means that the hash loses no information (or that you restrict yourself to a small set of possible values for your class). If any int32 value is allowed for each field, a collision-free hash is a monstrously big index that cannot fit in a std::size_t, let alone a small table: three int32 fields plus a pointer already carry more than 64 bits. Let unordered_map take care of collisions, and combine the std::hash results as explained above.
In your case, it will look something like this:
std::size_t hash() const
{
std::size_t h1 = std::hash<const CEditLine*>()(editLine); // editLine is a const CEditLine*
//Your int32 type is probably a typedef of a hashable type. Otherwise,
// you'll have to static_cast<> it to a type supported by std::hash.
std::size_t h2 = std::hash<int32>()(nDisplayLine);
std::size_t h3 = std::hash<int32>()(start);
std::size_t h4 = std::hash<int32>()(len);
std::size_t hash = 0;
hash_combine(hash, h1);
hash_combine(hash, h2);
hash_combine(hash, h3);
hash_combine(hash, h4);
return hash;
}
Then, you can specialize the std::hash operator for your class.
namespace std
{
template<>
struct hash<CacheKey>
{
public:
std::size_t operator()(CacheKey const& s) const
{
return s.hash();
}
};
}
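Putting the pieces together, the following is a self-contained sketch of the same approach. LineKey is a hypothetical, simplified stand-in for CacheKey (the pointer is a plain const void*), and hashCombine and demoLookup are illustrative names. Note that the key type itself, not int, is the map's key.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical, simplified stand-in for the question's CacheKey.
struct LineKey {
    const void* line;
    std::int32_t displayLine;
    std::int32_t start;
    std::int32_t len;

    friend bool operator==(const LineKey& a, const LineKey& b) {
        return a.line == b.line && a.displayLine == b.displayLine &&
               a.start == b.start && a.len == b.len;
    }
};

// Same mixing step as the hash_combine shown above.
inline void hashCombine(std::size_t& seed, std::size_t value) {
    seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

namespace std {
template <>
struct hash<LineKey> {
    std::size_t operator()(const LineKey& key) const noexcept {
        std::size_t seed = 0;
        hashCombine(seed, std::hash<const void*>()(key.line));
        hashCombine(seed, std::hash<std::int32_t>()(key.displayLine));
        hashCombine(seed, std::hash<std::int32_t>()(key.start));
        hashCombine(seed, std::hash<std::int32_t>()(key.len));
        return seed;
    }
};
}  // namespace std

// The container resolves hash collisions itself by falling back to operator==.
inline int demoLookup() {
    std::unordered_map<LineKey, int> cache;
    const LineKey key{nullptr, 3, 0, 10};
    cache[key] = 7;
    assert(cache.count(LineKey{nullptr, 3, 0, 10}) == 1);  // equal fields, same entry
    assert(cache.count(LineKey{nullptr, 4, 0, 10}) == 0);  // different displayLine
    return cache.at(key);
}
```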

Efficient way to convert int to string

I'm creating a game in which I have a main loop. During one cycle of this loop, I have to convert an int value to a string roughly 50-100 times. So far I've been using this function:
std::string Util::intToString(int val)
{
std::ostringstream s;
s << val;
return s.str();
}
But it doesn't seem to be very efficient: the frame rate drops from ~120 FPS (without this function) to ~95 FPS (with it).
Is there any other way to convert int to string that would be much more efficient than my function?

It's 1-72 range. I don't have to deal with negatives.
Pre-create an array/vector of 73 string objects, and use an index to get your string. Returning a const reference will let you save on allocations/deallocations, too:
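// Table of the strings "0", "1", ..., "72", built once at static-initialization time.
static const vector<string> smallNumbers = [] {
vector<string> t;
for (int i = 0; i <= 72; ++i) t.push_back(to_string(i));
return t;
}();
const string& smallIntToString(unsigned int val) {
return smallNumbers[val < smallNumbers.size() ? val : 0];
}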

The standard std::to_string function might be useful.
However, in this case I wonder whether copying the string when returning it is as big a bottleneck as the conversion itself. If so, you could pass the destination string as a reference argument to the function instead. Then again, if you have std::to_string, the compiler is probably C++11-compatible and can use move semantics instead of copying.
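If a C++17 compiler is available, std::to_chars from <charconv> formats into a caller-provided buffer without a stream or locale, which fits the "pass the destination by reference" idea. This is a sketch under that assumption, not a measured fix for the FPS drop; appendInt and intToString are hypothetical helper names:

```cpp
#include <charconv>
#include <cstddef>
#include <string>
#include <system_error>

// Appends the decimal form of `value` to `out` without creating a stream.
inline void appendInt(std::string& out, int value) {
    char buffer[16];  // enough for a 32-bit int including the sign
    const std::to_chars_result result =
        std::to_chars(buffer, buffer + sizeof buffer, value);
    if (result.ec == std::errc()) {  // success: [buffer, result.ptr) holds the digits
        out.append(buffer, static_cast<std::size_t>(result.ptr - buffer));
    }
}

// Convenience wrapper with the same shape as Util::intToString.
inline std::string intToString(int value) {
    std::string text;
    appendInt(text, value);
    return text;
}
```

Reusing one std::string across the loop and calling clear() before appendInt lets the string keep its capacity between conversions.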

Yep — fall back on functions from C, as explored in this previous answer:
namespace boost {
template<>
inline std::string lexical_cast(const int& arg)
{
char buffer[65]; // large enough for arg < 2^200
ltoa( arg, buffer, 10 );
return std::string( buffer ); // RVO will take place here
}
}//namespace boost
In theory, this new specialisation will take effect throughout the rest of the Translation Unit in which you defined it. ltoa is much faster (despite being non-standard) than constructing and using a stringstream.
However, I've experienced problems with name conflicts between instantiations of this specialisation, and instantiations of the original function template, between competing shared libraries.
In order to get around that, I actually just give this function a whole new name entirely:
template <typename T>
inline std::string fast_lexical_cast(const T& arg)
{
return boost::lexical_cast<std::string>(arg);
}
// Specialisation of the function above for int (the names must match).
template <>
inline std::string fast_lexical_cast(const int& arg)
{
char buffer[65];
if (!ltoa(arg, buffer, 10)) {
boost::throw_exception(boost::bad_lexical_cast(
typeid(std::string), typeid(int)
));
}
return std::string(buffer);
}
Usage: std::string myString = fast_lexical_cast(42);
Disclaimer: this modification is reverse-engineered from Kirill's original SO code, not the version that I created and put into production from my company codebase. I can't think right now, though, of any other significant modifications that I made to it.

Something like this:
const int size = 12;
char buf[size+1];
buf[size] = 0;
int index = size;
bool neg = false;
if (val < 0) { // Obviously don't need this if val is always positive.
neg = true;
val = -val;
}
do
{
buf[--index] = (val % 10) + '0';
val /= 10;
} while(val);
if (neg)
{
buf[--index] = '-';
}
return std::string(&buf[index]);

I use this:
void append_uint_to_str(string & s, unsigned int i)
{
if(i > 9)
append_uint_to_str(s, i / 10);
s += '0' + i % 10;
}
If you want to handle negative values, change the parameter to a signed int and insert:
if(i < 0)
{
s += '-';
i = -i;
}
at the beginning of the function.
