std::hash value on char* value and not on memory address?

As stated in this link:
There is no specialization for C strings. std::hash produces a hash of the value of the pointer (the memory address), it does not examine the contents of any character array.
Which means that with the same char* value, different hashcodes could be produced. For example, having this code:
//MOK and MOV are template arguments
void emit(MOK key, MOV value) {
auto h = hash<MOK>()(key);
cout<<"key="<<key<<" h="<<h<<endl;
...
This is the output produced by calling emit() four times with the same key text (with MOK=char*), where each call receives a different token/string object:
key=hello h=140311481289184
key=hello h=140311414180320
key=hello h=140311414180326
key=hello h=140311481289190
How can I obtain the same hash code for char* keys that hold the same text? I'd prefer not to use Boost.

There is of course the trivial (and slow) solution of creating a temporary std::string and hashing that one. If you don't want to do this, I'm afraid you will have to implement your own hash function. Sadly enough, the current C++ standard library doesn't provide general purpose hash algorithms disentangled from object-specific hash solutions. (But there is some hope this could change in the future.)
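A minimal sketch of that temporary-string approach, for comparison with what follows. The name hash_via_string is made up for illustration. It copies the characters into a std::string, so two distinct buffers holding the same text hash identically, at the cost of a copy (and possibly an allocation) per call.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: hash the characters a C string points to,
// not the pointer itself, by going through a temporary std::string.
inline std::size_t hash_via_string(const char* s)
{
    return std::hash<std::string>{}(std::string(s));
}
```

Calling hash_via_string on two different char arrays that both contain "hello" gives the same value, which is exactly what std::hash<char*> does not do.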
Suppose you had a function
std::size_t
hash_bytes(const void * data, std::size_t size) noexcept;
that takes an address and a size and returns a hash computed from that many bytes following the address. With the help of that function, you could easily write
template <typename T>
struct myhash
{
std::size_t
operator()(const T& obj) const noexcept
{
// Fallback implementation.
auto hashfn = std::hash<T> {};
return hashfn(obj);
}
};
and then specialize it for the types you're interested in.
template <>
struct myhash<std::string>
{
std::size_t
operator()(const std::string& s) const noexcept
{
return hash_bytes(s.data(), s.size());
}
};
template <>
struct myhash<const char *>
{
std::size_t
operator()(const char *const s) const noexcept
{
return hash_bytes(s, std::strlen(s));
}
};
This leaves you only with the exercise of implementing hash_bytes. Fortunately, there are some fairly good hash functions that are rather easy to implement. My go-to algorithm for simple hashing is the Fowler-Noll-Vo hash function. You can implement it in five lines of code; see the linked Wikipedia article.
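For reference, the short version might look like the sketch below. The function name fnv1a_64_simple is an assumption for illustration, it is fixed to the 64-bit variant, and it takes const char* rather than const void* so that it can be constexpr (which needs C++14 for the loop).

```cpp
#include <cstddef>
#include <cstdint>

// Minimal 64-bit FNV-1a over a byte range.
// The two constants are the published 64-bit offset basis and prime.
constexpr std::uint64_t fnv1a_64_simple(const char* data, std::size_t size) noexcept
{
    std::uint64_t hash = 14695981039346656037ULL;        // offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= static_cast<unsigned char>(data[i]);      // xor in one byte
        hash *= 1099511628211ULL;                         // multiply by the prime
    }
    return hash;
}
```

An empty input returns the offset basis unchanged, and the single byte "a" hashes to 0xaf63dc4c8601ec8c, which is handy as a quick sanity check.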
If you want to get a bit fancy, consider the following implementation. First, I define a generic template that can be instantiated for any variant of the FNV-1a hash function.
template <typename ResultT, ResultT OffsetBasis, ResultT Prime>
class basic_fnv1a final
{
static_assert(std::is_unsigned<ResultT>::value, "need unsigned integer");
public:
using result_type = ResultT;
private:
result_type state_ {};
public:
constexpr
basic_fnv1a() noexcept : state_ {OffsetBasis}
{
}
constexpr void
update(const void *const data, const std::size_t size) noexcept
{
const auto cdata = static_cast<const unsigned char *>(data);
auto acc = this->state_;
for (auto i = std::size_t {}; i < size; ++i)
{
const auto next = std::size_t {cdata[i]};
acc = (acc ^ next) * Prime;
}
this->state_ = acc;
}
constexpr result_type
digest() const noexcept
{
return this->state_;
}
};
Next, I provide aliases for the 32 and 64 bit versions. The parameters were taken from Landon Curt Noll's website.
using fnv1a_32 = basic_fnv1a<std::uint32_t,
UINT32_C(2166136261),
UINT32_C(16777619)>;
using fnv1a_64 = basic_fnv1a<std::uint64_t,
UINT64_C(14695981039346656037),
UINT64_C(1099511628211)>;
Finally, I provide type meta-functions to select a version of the algorithm given the wanted number of bits.
template <std::size_t Bits>
struct fnv1a;
template <>
struct fnv1a<32>
{
using type = fnv1a_32;
};
template <>
struct fnv1a<64>
{
using type = fnv1a_64;
};
template <std::size_t Bits>
using fnv1a_t = typename fnv1a<Bits>::type;
And with that, we're good to go.
constexpr std::size_t
hash_bytes(const void *const data, const std::size_t size) noexcept
{
auto hashfn = fnv1a_t<CHAR_BIT * sizeof(std::size_t)> {};
hashfn.update(data, size);
return hashfn.digest();
}
Note how this code automatically adapts to platforms where std::size_t is 32 or 64 bits wide.

I've had to do this before and ended up writing a function to do this, with essentially the same implementation as Java's String hash function:
size_t hash_c_string(const char* p, size_t s) {
size_t result = 0;
const size_t prime = 31;
for (size_t i = 0; i < s; ++i) {
result = p[i] + (result * prime);
}
return result;
}
Mind you, this is NOT a cryptographically secure hash, but it is fast enough and yields good results.
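As a worked example of that recurrence, here is a sketch that also adds a convenience overload for NUL-terminated strings. The overload and the constexpr markers (C++14) are my additions, not part of the answer above, and the cast to unsigned char is a small deviation so that bytes of 0x80 and above do not sign-extend.

```cpp
#include <cstddef>

// Same recurrence as above: result = result * 31 + next character.
constexpr std::size_t hash_c_string(const char* p, std::size_t s)
{
    std::size_t result = 0;
    const std::size_t prime = 31;
    for (std::size_t i = 0; i < s; ++i) {
        // Go through unsigned char so high bytes are not sign-extended.
        result = static_cast<unsigned char>(p[i]) + (result * prime);
    }
    return result;
}

// Hypothetical convenience overload: measure the length, then hash.
constexpr std::size_t hash_c_string(const char* p)
{
    std::size_t length = 0;
    while (p[length] != '\0') ++length;
    return hash_c_string(p, length);
}
```

For "hello" the steps are 104, 3325, 103183, 3198781, 99162322, so hash_c_string("hello") is 99162322. Longer strings wrap around at the width of size_t rather than at 32 bits, so they will generally differ from Java's values.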

In C++17 you should use std::hash<std::string_view> which works seamlessly since const char* can be implicitly converted to it.
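A small sketch of what "seamlessly" can look like in a container. The function name count_token is made up for illustration. Keying an unordered_map by std::string_view gives both content-based hashing and content-based equality, and a const char* converts at the call site.

```cpp
#include <string_view>
#include <unordered_map>

// Count occurrences keyed by the characters, not by the pointer (C++17).
// Caveat: the map stores views only, so the character data must outlive the entries.
inline int count_token(std::unordered_map<std::string_view, int>& counts, const char* token)
{
    return ++counts[token]; // const char* converts implicitly to std::string_view
}
```

Two different buffers that both contain "hello" end up in the same map entry, so the second call returns 2.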

C++17 added std::string_view, including a std::hash specialization for it, so you can use that to compute the hash value of a C string.
Example:
#include <string_view>
#include <cstring>
static size_t hash_cstr(const char *s)
{
return std::hash<std::string_view>()(std::string_view(s, std::strlen(s)));
}
If you have to deal with a pre-C++17 compiler, you can check your standard library for an implementation-defined hash function and call that.
For example, libstdc++ (which is what GCC uses by default) provides std::_Hash_bytes. It is an internal, non-portable function, and it can be called like this:
#include <functional>
// -> which finally includes /usr/include/c++/$x/bits/hash_bytes.h
#include <cstring>
static size_t hash_cstr_gnu(const char *s)
{
const size_t seed = 0;
return std::_Hash_bytes(s, std::strlen(s), seed);
}

You can use std::collate::hash
e.g. https://www.cplusplus.com/reference/locale/collate/hash/
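A sketch of how that call might look for a C string. The wrapper name hash_cstr_collate is an assumption, and the classic "C" locale is used so that the result does not depend on the global locale.

```cpp
#include <cstring>
#include <locale>

// Hash a NUL-terminated string with the collate facet of the classic "C" locale.
// std::collate<char>::hash takes a [begin, end) character range and returns long.
inline long hash_cstr_collate(const char* s)
{
    const auto& facet = std::use_facet<std::collate<char>>(std::locale::classic());
    return facet.hash(s, s + std::strlen(s));
}
```

Note that the return type is long rather than std::size_t, and strings that collate as equal hash to the same value.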

Related

C++20 string literal template argument working example

Could someone post a minimal reproducible example of C++20's feature string template as template argument?
this one from ModernCpp does not compile:
template<std::basic_fixed_string T>
class Foo {
static constexpr char const* Name = T;
public:
void hello() const;
};
int main() {
Foo<"Hello!"> foo;
foo.hello();
}
I've managed to write a working solution based on this Reddit post:
#include <iostream>
template<unsigned N>
struct FixedString
{
char buf[N + 1]{};
constexpr FixedString(char const* s)
{
for (unsigned i = 0; i != N; ++i) buf[i] = s[i];
}
constexpr operator char const*() const { return buf; }
// not mandatory anymore
auto operator<=>(const FixedString&) const = default;
};
template<unsigned N> FixedString(char const (&)[N]) -> FixedString<N - 1>;
template<FixedString Name>
class Foo
{
public:
auto hello() const { return Name; }
};
int main()
{
Foo<"Hello!"> foo;
std::cout << foo.hello() << std::endl;
}
This works, but it relies on a custom fixed-string implementation. So what would be the state-of-the-art implementation by now?

P0259 fixed_string has been retired on the basis that most of its use cases are better achieved via P0784 More constexpr containers (aka constexpr destructors and transient allocation) - i.e. being able to use std::string itself within constexpr context.
Of course, even if you can use std::string in constexpr, that doesn't make it usable as an NTTP, but it doesn't look like we're going to get a structural string class into the Standard any time soon. I would recommend using your own structural string class for now, being ready to alias it to an appropriate third-party class should one emerge in a popular library, or to a standard one if and when that happens.

C++ Get Name of Function from Template

For debugging purposes I'd like to extract the name of a function from a template argument. However, I'm only getting the function's signature, not its actual name.
namespace internal
{
static const unsigned int FRONT_SIZE = sizeof("internal::GetTypeNameHelper<") - 1u;
static const unsigned int BACK_SIZE = sizeof(">::GetTypeName") - 1u;
template<typename T>
struct GetTypeNameHelper
{
static const char* GetTypeName(void)
{
#ifdef __GNUC__
static const size_t size = sizeof(__PRETTY_FUNCTION__);
static char typeName[size] = { };
memcpy(typeName, __PRETTY_FUNCTION__, size - 1u);
#else
static const size_t size = sizeof(__FUNCTION__) - FRONT_SIZE - BACK_SIZE;
static char typeName[size] =
{};
memcpy(typeName, __FUNCTION__ + FRONT_SIZE, size - 1u);
#endif //__GNUC__
return typeName;
}
};
} //namespace internal
template<typename T>
const char* GetTypeName(void)
{
return internal::GetTypeNameHelper<T>::GetTypeName();
}
Calling this from my own make function
template<typename Func_T, typename ... Args>
CExtended_Function<Args...> Make_Extended_Function(Func_T f)
{
std::function<void(Args...)> func(f);
const char* pFunc_Name = NCommonFunctions::GetTypeName<Func_T>();
CExtended_Function<Args...> res(f, pFunc_Name);
return res;
}
with
void Test_Func();
void foo()
{
Make_Extended_Function(Test_Func);
}
Gives me only the function signature.
... [with T = void (*)()]...
However I'd like to get the function name (in this case "Test_Func")
I thought about using macros, but I'm not sure how to implement the Args... part in macros. Do you have an idea how to solve this? I'd like to avoid using RTTI.

A function can't be passed where a template type argument is expected: your template argument here is the type of a pointer to the function, not the function itself. Every function with that signature has the same type, so recovering the name from it is impossible. There is also no portable way to get the name of a particular function at compile time, at least at the moment (this may become possible in the future through compile-time reflection, but that's going to be C++2y (23?) at the earliest).

With a macro, you can do this (I also use CTAD from C++17):
template<typename F>
auto Make_Extended_Function_Impl(F f, const std::string& name)
{
std::function func(f);
CExtended_Function res(f, name);
return res;
}
// The macro must forward to the _Impl template; #f turns the argument into a string literal.
#define Make_Extended_Function(f) Make_Extended_Function_Impl(f, #f)
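Since CExtended_Function isn't shown in the question, here is a self-contained sketch of the same stringizing trick. NamedFunction, make_named_function_impl and MAKE_NAMED_FUNCTION are hypothetical stand-ins, not names from the original code.

```cpp
#include <string>

// Hypothetical stand-in for CExtended_Function: a callable plus its name.
template <typename F>
struct NamedFunction
{
    F function;
    std::string name;
};

template <typename F>
NamedFunction<F> make_named_function_impl(F f, const char* name)
{
    return NamedFunction<F>{f, name};
}

// #f stringizes the macro argument exactly as written at the call site.
#define MAKE_NAMED_FUNCTION(f) make_named_function_impl(f, #f)

// Sample function to wrap.
inline void Test_Func() {}
```

MAKE_NAMED_FUNCTION(Test_Func) yields an object whose name member is "Test_Func". The string is whatever text was passed to the macro, so MAKE_NAMED_FUNCTION(&Test_Func) would store "&Test_Func" instead.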

Had to specify upper limit while evaluating the length of C-style string through template metaprogramming

I recently started playing around with template metaprogramming in C++, and have been trying to evaluate the length of a C-style string.
I've had some success with this bit of code
template <const char *str, std::size_t index>
class str_length {
public:
static inline std::size_t val() {
return (str[index] != '\0') ? (1 + str_length<str, index + 1>::val()) : 0;
}
};
template <const char *str>
class str_length <str, 500> {
public:
static inline std::size_t val() {
return 0;
}
};
extern const char bitarr[] { "0000000000000000000" };
int main() {
std::cout << str_length<bitarr, 0>::val() << std::endl;
getchar();
return 0;
}
However, I had to set an "upper limit" of 500 by creating a specialization of str_length. Omitting that would cause my compiler to run indefinitely (presumably creating infinite specializations of str_length).
Is there anything I could do to not specify the index = 500 limit?
I'm using VC++2015 if that helps.
Oh, and I'm not using constexpr because VC++ doesn't quite support the C++14 extended constexpr features yet. (https://msdn.microsoft.com/en-us/library/hh567368.aspx#cpp14table)

The usual way to stop infinite template instantiation in these situations is specialization, which is orthogonal to the constexpr-ness of anything. Reviewing the list of additional things that extended constexpr allows in C++14, I see nothing in the following example that needs extended constexpr support. gcc 6.1.1 compiles this in -std=c++11 compliance mode, FWIW:
#include <iostream>
template<const char *str, size_t index, char c> class str_length_helper;
template <const char *str>
class str_length {
public:
static constexpr std::size_t val()
{
return str_length_helper<str, 0, str[0]>::val();
}
};
template<const char *str, std::size_t index, char c>
class str_length_helper {
public:
static constexpr std::size_t val()
{
return 1+str_length_helper<str, index+1, str[index+1]>::val();
}
};
template<const char *str, std::size_t index>
class str_length_helper<str, index, 0> {
public:
static constexpr std::size_t val()
{
return 0;
}
};
static constexpr char bitarr[] { "0000000000000000000" };
int main() {
std::cout << str_length<bitarr>::val() << std::endl;
getchar();
return 0;
}
Note, however, that the character string itself must be constexpr. As the comments noted, this is of dubious practical use; but there's nothing wrong with messing around in this manner in order to get the hang of metaprogramming.
The key point is the use of specialization, and the fact that in order to be able to use str[index] as a template parameter, str must be constexpr.
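For comparison, a sketch of two non-template-recursion routes that stay within C++11 constexpr rules (a single return statement, recursion instead of a loop). The names cstr_length and array_length are made up for illustration, and I have not checked this against VC++ 2015 specifically.

```cpp
#include <cstddef>

// C++11-style constexpr: the ternary ends the recursion at the terminating NUL.
constexpr std::size_t cstr_length(const char* s)
{
    return *s == '\0' ? 0 : 1 + cstr_length(s + 1);
}

// For a string literal or char array, the length is already part of the type.
template <std::size_t N>
constexpr std::size_t array_length(const char (&)[N])
{
    return N - 1; // drop the terminating NUL
}

constexpr char sample[] = "hello";
static_assert(cstr_length(sample) == 5, "recursive count");
static_assert(array_length(sample) == 5, "size taken from the array type");
```

Neither version needs an artificial upper limit: the first stops at the NUL character, and the second never iterates at all.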

how do I use type_traits or template function specialization to consolidate template methods

I am trying to consolidate a number of very similar methods from a class like the one shown below, and I thought that the best way to implement this efficiently would be through the use of templates coupled with either template function specialization or type traits. I am a newbie to template specialization and type traits, but I understand the basic concepts, which is why I am asking for some guidance on the details. As a starting point, my class is a smart buffer class that has many method signatures similar to those listed below.
class OldSafeBuffer {
public:
intmax_t writeAt(const intmax_t& rIndex, const uint32_t val32);
intmax_t writeAt(const intmax_t& rIndex, const int32_t val32);
intmax_t readFrom(const intmax_t& rIndex, uint32_t& rVal32);
intmax_t readFrom(const intmax_t& rIndex, int32_t& rVal32);
intmax_t writeAt(const intmax_t& rIndex, const uint16_t val16);
intmax_t writeAt(const intmax_t& rIndex, const int16_t val16);
intmax_t readFrom(const intmax_t& rIndex, uint16_t& rVal16);
intmax_t readFrom(const intmax_t& rIndex, int16_t& rVal16);
intmax_t read(uint32_t& rVal32);
intmax_t read(int32_t& rVal32);
intmax_t read(uint16_t& rVal16);
intmax_t read(int16_t& rVal16);
protected:
// Actual memory storage.
std::unique_ptr<char[]> mBuffer;
// Buffer length
intmax_t mBufferLength;
// Represents the largest byte offset referenced.
// Can be used to retrieve written length of buffer.
intmax_t mHighWaterMark;
// If set, caller wanted to pack data in network-byte-order.
bool mPackNBO;
// Set on construction, determines whether value needs to be byte-swapped.
bool mSwapNeeded;
// Used for file compatibility
intmax_t mPosition;
};
I thought that this would be a perfect candidate for conversion to use template functions as these functions are very similar and I had a lot of repeated code in each method. The difference between methods was mainly the sign and the size of the 16 or 32 bit value argument.
Anyway to consolidate the readFrom methods I put the following method together. I also did similar things for the write methods. These are shown in the compiling live example.
/**
* Read value (signed or unsigned) from buffer at given byte offset.
*
* @param rIndex [in]
* @param rVal [out]
*
* @return BytesRead or -1 on error
*/
template <typename T>
inline intmax_t readFrom(const intmax_t& rIndex, T& rVal)
{
if ((rIndex + static_cast<intmax_t>(sizeof(T))) <= mBufferLength) {
T* pVal = (T *)&mBuffer[rIndex];
rVal = *pVal;
// #JC Partial Template Specialization for 16 bit entities?
if (sizeof(rVal) > sizeof(int16_t)) {
SWAP32(rVal);
} else {
SWAP16(rVal);
}
mPosition = rIndex + sizeof(T);
return sizeof(rVal);
}
return -1;
}
As can be seen from my comment, I still need to know the size of 'T& rVal' argument in order to decide whether to do a SWAP32 or SWAP16 on the argument. This was why I thought that type_traits could come in useful rather than having to put in a runtime check to compare the size of the argument.
I think that I am on the right track, but I cannot figure out how to use type_traits to check the argument type and act on it. Alternatively, I could use template method specialization to do special things for 16-bit arguments, but I think that would not save much effort, as I would also have to specialize on both the signed and unsigned variants of the 16-bit argument type (assuming the non-specialized version was for 32-bit value arguments). Any help figuring this out would be much appreciated.

You may use something like:
template<typename T, std::size_t N = sizeof(T)> struct Swap;
template<typename T> struct Swap<T, 1> {
void operator() (T&) const { /* Do nothing*/ }
};
template<typename T> struct Swap<T, 2> {
void operator() (T& val) const { SWAP16(val); }
};
template<typename T> struct Swap<T, 4> {
void operator() (T& val) const { SWAP32(val); }
};
And then call it:
Swap<T>()(rVal);
So in context:
if (sizeof(T) > sizeof(int16_t)) {
SWAP32(val);
} else {
SWAP16(val);
}
can be written as
Swap<T>()(val);
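Here is a self-contained sketch of the same size-based dispatch. The SWAP16 and SWAP32 macros aren't shown in the question, so swap16 and swap32 below are hypothetical stand-ins, and ByteSwap returns the swapped value instead of modifying a reference so that it can be checked at compile time.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the SWAP16 / SWAP32 macros.
constexpr std::uint16_t swap16(std::uint16_t v)
{
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

constexpr std::uint32_t swap32(std::uint32_t v)
{
    return (v << 24) | ((v & 0x0000FF00u) << 8) | ((v >> 8) & 0x0000FF00u) | (v >> 24);
}

// Primary template is only declared: an unsupported size fails to compile.
template <typename T, std::size_t N = sizeof(T)>
struct ByteSwap;

template <typename T>
struct ByteSwap<T, 1> { static constexpr T apply(T v) { return v; } };

template <typename T>
struct ByteSwap<T, 2>
{
    static constexpr T apply(T v) { return static_cast<T>(swap16(static_cast<std::uint16_t>(v))); }
};

template <typename T>
struct ByteSwap<T, 4>
{
    static constexpr T apply(T v) { return static_cast<T>(swap32(static_cast<std::uint32_t>(v))); }
};

// Signed and unsigned types of the same width share one specialization.
template <typename T>
constexpr T byte_swap(T v) { return ByteSwap<T>::apply(v); }
```

Because the choice depends only on sizeof(T), int16_t and uint16_t both land in the 2-byte specialization, which avoids writing one specialization per signedness.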

You could use template specialization to do a specialized swap function like the example below:
template<typename T>
struct Swap;
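template<>
struct Swap<int16_t> {
// Take the value by reference, otherwise only a local copy is swapped.
static void swap(int16_t& val) { SWAP16(val); }
};
template<>
struct Swap<int32_t> {
static void swap(int32_t& val) { SWAP32(val); }
};
Then you could call it in your code like this (uint16_t and uint32_t would need matching specializations, since the primary template is only declared):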
template <typename T>
inline intmax_t readFrom(const intmax_t& rIndex, T& rVal)
{
if ((rIndex + static_cast<intmax_t>(sizeof(T))) <= mBufferLength) {
T* pVal = (T *)&mBuffer[rIndex];
rVal = *pVal;
Swap<T>::swap(rVal);
mPosition = rIndex + sizeof(T);
return sizeof(rVal);
}
return -1;
}

You may just define the swap method for your 4 types:
inline void swap_endianess(int16_t& value) { SWAP16(value); }
inline void swap_endianess(uint16_t& value) { SWAP16(value); }
inline void swap_endianess(int32_t& value) { SWAP32(value); }
inline void swap_endianess(uint32_t& value) { SWAP32(value); }
and let the template function dispatch to the correct one.
so instead of
if (sizeof(T) > sizeof(int16_t)) {
SWAP32(val);
} else {
SWAP16(val);
}
just call
swap_endianess(val);

How to specialize a template function by static array of structures

I am a bit stuck and need help from a C++ template guru. There is a template struct:
template<typename T, typename ID>
struct TypeMapping
{
T Type;
char* Name;
ID Id;
};
and a few template functions like this:
template<typename T, typename ID>
bool TryGetTypeByNameImp(const TypeMapping<T, ID> map[], size_t mapSize,
const char* name, T& type)
{
for (size_t i = 0; i < mapSize; i++)
{
if (strcmp(map[i].Name, name) == 0)
{
type = map[i].Type;
return true;
}
}
return false;
}
Map (the first parameter) is defined as (there are a few similar maps)
namespace audio
{
const TypeMapping<Type, AMF_CODEC_ID> Map[] =
{
{AAC, "aac", AMF_CODEC_AAC},
{MP3, "mp3", AMF_CODEC_MP3},
{PCM, "pcm", AMF_CODEC_PCM_MULAW}
};
const size_t MapSize = sizeof(Map)/sizeof(Map[0]);
}
Map is passed to a function as an argument and I am looking for how to pass it as template parameter so I can use functions like in this sample:
audio::Type type;
bool r = TryGetTypeByNameImp<audio::Map>("aac", type);
The only solution I have found is to define a struct that holds a static Map and MapSize and to use that struct as the template parameter, but I do not like this solution and am looking for another one. Does anybody know how to do this?

bool r = TryGetTypeByNameImp<audio::Map>("aac", type);
This is trying to use audio::Map as a type – but it isn’t, it’s a variable. Just pass it to the function as a normal argument:
bool r = TryGetTypeByNameImp(audio::Map, "aac", type);
That said, I have three remarks about your code:
1. Be aware that declaring a function argument as an array (x[]) does in reality declare it as a pointer. Your code uses this correctly, but using the array syntax is misleading. Use a pointer instead.
2. This code is slightly too C-heavy for my taste. While I agree that using raw C-strings is appropriate here, your usage of char* is illegal in C++11, and deprecated in C++03 (since you are pointing to string literals). Use char const*. Furthermore, I’d suggest using a std::string argument in the function, and using the comparison operator == instead of strcmp.
3. You are using an out-parameter, type. I abhor this technique. If you want to return a value, use the return type. Since you also return a success value, use a pair as the return type, unless there’s a very compelling reason not to:
template<typename T, typename ID>
std::pair<bool, T> TryGetTypeByNameImp(
const TypeMapping<T, ID> map[], size_t mapSize,
const char* name)
{
for (size_t i = 0; i < mapSize; i++)
if (strcmp(map[i].Name, name) == 0)
return std::make_pair(true, map[i].Type);
return std::make_pair(false, T());
}
Ah, and I’d also consider using a std::vector or std::array here instead of a C array. Then you don’t need to manually shlep the array size around through all the functions which use the array.
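A third option, besides std::vector and std::array, is to take the C array by reference so that its length is deduced. The sketch below assumes that approach. The audio_example namespace and its enumerators are made-up sample data standing in for audio::Map, and the function is renamed TryGetTypeByName to keep it apart from the original.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

template <typename T, typename ID>
struct TypeMapping
{
    T Type;
    const char* Name;
    ID Id;
};

// The array is taken by reference, so N is deduced and no size argument is needed.
template <typename T, typename ID, std::size_t N>
std::pair<bool, T> TryGetTypeByName(const TypeMapping<T, ID> (&map)[N], const char* name)
{
    for (std::size_t i = 0; i < N; ++i)
        if (std::strcmp(map[i].Name, name) == 0)
            return std::make_pair(true, map[i].Type);
    return std::make_pair(false, T());
}

// Hypothetical sample data standing in for audio::Map from the question.
namespace audio_example
{
    enum Type { AAC, MP3, PCM };
    enum CodecId { CODEC_AAC = 1, CODEC_MP3 = 2, CODEC_PCM = 3 };

    const TypeMapping<Type, CodecId> Map[] =
    {
        {AAC, "aac", CODEC_AAC},
        {MP3, "mp3", CODEC_MP3},
        {PCM, "pcm", CODEC_PCM}
    };
}
```

A call then reads TryGetTypeByName(audio_example::Map, "mp3"), and the caller checks .first before using .second.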

You can certainly use the array itself (well, a pointer to it) as a template parameter:
#include <iostream>
template<typename T> struct S { T t; };
S<int> s[] = { { 21 }, { 22 } };
template<typename T, size_t n, S<T> (*m)[n]> void f() { std::cout << (*m)[n - 1].t; }
int main() {
f<int, 2, &s>();
}
The problem here is that you can't use template argument deduction on the length of the array nor on its type, so both must be supplied as template parameters in addition to the array itself. I really think that passing in a struct or, say a vector would be the better solution, as you've no doubt already explored:
#include <vector>
#include <iostream>
template<typename T> struct S { T t; };
std::vector<S<int>> s{ { 21 }, { 22 } };
template<typename T, std::vector<S<T>> *v> void f() { std::cout << v->back().t; }
int main() {
f<int, &s>();
}
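The deduction limitation described above applies to C++14 and earlier. With C++17, a placeholder non-type template parameter can deduce the array's element type and length, which gets close to the call syntax the question asks for. This is a sketch under that assumption, and the audio_example namespace and its enumerators are again made-up sample data.

```cpp
#include <cstring>

template <typename T, typename ID>
struct TypeMapping
{
    T Type;
    const char* Name;
    ID Id;
};

// Hypothetical sample data standing in for audio::Map from the question.
namespace audio_example
{
    enum Type { AAC, MP3, PCM };
    enum CodecId { CODEC_AAC = 1, CODEC_MP3 = 2, CODEC_PCM = 3 };

    constexpr TypeMapping<Type, CodecId> Map[] =
    {
        {AAC, "aac", CODEC_AAC},
        {MP3, "mp3", CODEC_MP3},
        {PCM, "pcm", CODEC_PCM}
    };
}

// C++17: the placeholder deduces the full array type, element type and length included.
template <const auto& Map, typename T>
bool TryGetTypeByName(const char* name, T& type)
{
    for (const auto& entry : Map)
    {
        if (std::strcmp(entry.Name, name) == 0)
        {
            type = entry.Type;
            return true;
        }
    }
    return false;
}
```

The call site becomes TryGetTypeByName<audio_example::Map>("aac", type). The array has to have static storage duration to be usable as a template argument.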
