There is this crc32 implementation that I like: CygnusX1 CRC32
It works well at compile time:
ctcrc32("StackOverflow");
But is it possible to use it at runtime:
void myfunction(const std::string& str)
{
uint32_t hash = ctcrc32(str);
// ...
}
So far I have had to write a separate (runtime) function, but I would prefer to use just one.
EDIT
I tried
ctcrc32(str.c_str())
But it doesn't work (mismatched types ‘const char [len]’ and ‘const char*’). It seems to require a compile-time length.
Here is the implementation:
namespace detail {
// CRC32 Table (zlib polynomial)
static constexpr uint32_t crc_table[256] = { 0x00000000L, 0x77073096L, ... };
template<size_t idx>
constexpr uint32_t combine_crc32(const char * str, uint32_t part) {
return (part >> 8) ^ crc_table[(part ^ str[idx]) & 0x000000FF];
}
template<size_t idx>
constexpr uint32_t crc32(const char * str) {
return combine_crc32<idx>(str, crc32<idx - 1>(str));
}
// This is the stop-recursion function
template<>
constexpr uint32_t crc32<size_t(-1)>(const char * str) {
return 0xFFFFFFFF;
}
} //namespace detail
template <size_t len>
constexpr uint32_t ctcrc32(const char (&str)[len]) {
return detail::crc32<len - 2>(str) ^ 0xFFFFFFFF;
}
You cannot use it with a std::string without rewriting it. If you look at the main function:
template <size_t len>
constexpr uint32_t ctcrc32(const char (&str)[len]) {
return detail::crc32<len - 2>(str) ^ 0xFFFFFFFF;
}
...you see that it needs the length of the string at compile time because it uses it as a template parameter (detail::crc32<len - 2>).
ctcrc32 only works with character arrays whose size is known at compile time (they don't have to be const or constexpr, but the size must be known).
I wrote an answer to the linked question, based on the original implementation, that supports both compile-time and runtime strings:
https://stackoverflow.com/a/48924267/2666289
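For illustration, here is a minimal C++17 sketch of a single function that works in both contexts. It is not the code from the linked answer: it swaps the template recursion for a plain constexpr loop over a std::string_view, and it generates the table from the zlib polynomial instead of listing it. The namespace sketch and the names make_crc_table and crc32 are my own, hypothetical choices.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

namespace sketch {

// Build the 256-entry lookup table for the reflected zlib polynomial 0xEDB88320.
constexpr std::array<std::uint32_t, 256> make_crc_table() {
    std::array<std::uint32_t, 256> table{};
    for (std::uint32_t n = 0; n < 256; ++n) {
        std::uint32_t c = n;
        for (int k = 0; k < 8; ++k)
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        table[n] = c;
    }
    return table;
}

inline constexpr auto crc_table = make_crc_table();

// The length travels with the string_view, so it no longer has to be a
// template parameter. The terminating '\0' is not part of the view.
constexpr std::uint32_t crc32(std::string_view s) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (char ch : s)
        crc = (crc >> 8) ^ crc_table[(crc ^ static_cast<unsigned char>(ch)) & 0xFFu];
    return crc ^ 0xFFFFFFFFu;
}

// Compile-time use: "123456789" is the usual CRC-32 check input.
static_assert(crc32("123456789") == 0xCBF43926u, "CRC-32 check value");

}  // namespace sketch
```

At runtime the same function accepts a std::string, because std::string converts implicitly to std::string_view, so `sketch::crc32(str)` inside myfunction would compile as written.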
Related
I'm making a simple constexpr string encoder, see below.
template<char...Chars>
struct encoder
{
constexpr static char encode(char c)
{
return c ^ size;
}
constexpr static size_t size = sizeof...(Chars);
constexpr static const char value[size + 1] = {encode(Chars)...,0};
};
template<typename T,T...Chars>
constexpr auto operator""_encode()
{
return encoder<Chars...>::value;
}
Usage:
"aab"_encode
"123"_encode
I want the encode function to receive the index of each char, like this
constexpr static char encode(char c,uint32_t index)
{
return c ^ (size + index);
}
or like this
template<uint32_t index>
constexpr static char encode(char c)
{
return c ^ (size + index);
}
But I don't know how. Can anyone show me how to do that?
You can write the whole thing in a single constexpr function in C++17:
template<typename T, T...Chars>
constexpr auto operator""_encode()
{
constexpr std::size_t size = sizeof...(Chars);
std::array<char, size+1> ret = {}; // Maybe T instead of char?
int i = 0;
((ret[i] = Chars ^ (size + i), i++), ...);
ret[size] = 0;
return ret;
}
(I made it return a std::array instead of a builtin array for everyone's sanity.)
Here's a godbolt link, including one of your test inputs (it helps if you include the desired output, nobody likes poring over ASCII tables and xoring stuff by hand, even if I did that here):
https://godbolt.org/z/P8ABHM
Also, please don't use this to encrypt anything.
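If you would rather keep the template<index> form of encode from the question, a std::index_sequence can pair every character with its position. The following is only a sketch under my own assumptions: the names char_pack, encode_pack and encode_chars are hypothetical, and the characters are passed as explicit template arguments so the example does not depend on the string literal operator template, which is a compiler extension.

```cpp
#include <array>
#include <cstddef>
#include <utility>

namespace sketch {

// Encode one character; Size and Index are both known at compile time.
template <std::size_t Size, std::size_t Index>
constexpr char encode(char c) {
    return static_cast<char>(c ^ (Size + Index));
}

// Tag type that carries the characters so both packs can be deduced together.
template <char... Chars>
struct char_pack {};

// Chars and I have the same length and are expanded in lockstep.
template <char... Chars, std::size_t... I>
constexpr std::array<char, sizeof...(Chars) + 1>
encode_pack(char_pack<Chars...>, std::index_sequence<I...>) {
    return {{encode<sizeof...(Chars), I>(Chars)..., '\0'}};
}

template <char... Chars>
constexpr std::array<char, sizeof...(Chars) + 1> encode_chars() {
    return encode_pack(char_pack<Chars...>{},
                       std::make_index_sequence<sizeof...(Chars)>{});
}

}  // namespace sketch
```

Inside the literal operator from the question you would then return `sketch::encode_chars<Chars...>()`. For "aab" the size is 3, so the characters are xored with 3, 4 and 5 respectively.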
I have a very big code base that uses __FILE__ extensively for logging. However, it includes the full path, which is (1) not needed and (2) might cause security violations.
I'm trying to write a compile-time substring expression. I ended up with this solution:
static constexpr cstr PastLastSlash(cstr str, cstr last_slash)
{
return *str == '\0' ? last_slash : *str == '/' ? PastLastSlash(str + 1, str + 1) : PastLastSlash(str + 1, last_slash);
}
static constexpr cstr PastLastSlash(cstr str)
{
return PastLastSlash(str, str);
}
// usage
PastLastSlash(__FILE__);
This works well. I've checked the assembly code: the string is trimmed at compile time and only the file name is present in the binary.
However, this notation is too verbose. I would like to wrap it in a macro, but my attempts failed. The example proposed in the link above
#define __SHORT_FILE__ ({constexpr cstr sf__ {past_last_slash(__FILE__)}; sf__;})
doesn't work with the MSVC compiler (I'm using MSVC 2017). Is there any other way to do this using C++17?
UPD1: clang trimmed by function https://godbolt.org/z/tAU4j7
UPD2: it looks like it's possible to do the trimming at compile time using functions, but the full string will still be present in the binary.
The idea is to create a truncated array of characters using only compile-time features. Generating the data array through a variadic template with a pack of chars forces the compiler to generate data that has no direct relation to the string literal that was passed in. This way the compiler cannot reuse the input string literal, which matters especially when the string is long.
Godbolt with clang: https://godbolt.org/z/WdKNjB.
Godbolt with msvc: https://godbolt.org/z/auMEIH.
The only problem is the compiler's template instantiation depth limit.
First we define a variadic template of ints to store a sequence of indexes:
template <int... I>
struct Seq {};
Pushing int to Seq:
template <int V, typename T>
struct Push;
template <int V, int... I>
struct Push<V, Seq<I...>>
{
using type = Seq<V, I...>;
};
Creating sequence:
template <int From, int To>
struct MakeSeqImpl;
template <int To>
struct MakeSeqImpl<To, To>
{
using type = Seq<To>;
};
template <int From, int To>
using MakeSeq = typename MakeSeqImpl<From, To>::type;
template <int From, int To>
struct MakeSeqImpl : Push<From, MakeSeq<From + 1, To>> {};
Now we can make a sequence of compile-time ints, meaning that MakeSeq<3,7> == Seq<3,4,5,6,7>. We still need something to store the selected characters in an array, but using a compile-time representation, which is a variadic template parameter pack of characters:
template<char... CHARS>
struct Chars {
static constexpr const char value[] = {CHARS...};
};
template<char... CHARS>
constexpr const char Chars<CHARS...>::value[];
Next we need something to extract the selected characters into a Chars type:
template<typename WRAPPER, typename IDXS>
struct LiteralToVariadicCharsImpl;
template<typename WRAPPER, int... IDXS>
struct LiteralToVariadicCharsImpl<WRAPPER, Seq<IDXS...> > {
using type = Chars<WRAPPER::get()[IDXS]...>;
};
template<typename WRAPPER, typename SEQ>
struct LiteralToVariadicChars {
using type = typename LiteralToVariadicCharsImpl<WRAPPER, SEQ> :: type;
};
WRAPPER is a type that contains our string literal.
Almost done. The missing part is finding the last slash. We can use a modified version of the code from the question, but this time it returns an offset instead of a pointer:
static constexpr int PastLastOffset(int last_offset, int cur, const char * const str)
{
if (*str == '\0') return last_offset;
if (*str == '/') return PastLastOffset(cur + 1, cur + 1, str + 1);
return PastLastOffset(last_offset, cur + 1, str + 1);
}
One last utility to get the string length:
constexpr int StrLen(const char * str) {
if (*str == '\0') return 0;
return StrLen(str + 1) + 1;
}
Combining everything with a #define:
#define COMPILE_TIME_PAST_LAST_SLASH(STR) \
[](){ \
struct Wrapper { \
constexpr static const char * get() { return STR; } \
}; \
using Seq = MakeSeq<PastLastOffset(0, 0, Wrapper::get()), StrLen(Wrapper::get())>; \
return LiteralToVariadicChars<Wrapper, Seq>::type::value; \
}()
The lambda gives the macro a nice, value-like feel when it is used. It also creates a scope in which to define the Wrapper structure. Generating this structure with the string literal inserted by the macro means that the string literal is bound to a type.
Honestly, I would not use this kind of code in production. It is very hard on compilers.
For both security and memory usage, I would instead recommend building inside Docker with custom, short paths.
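As a side note on the template depth concern: the hand-rolled Seq/MakeSeq above recurses once per index. A possible alternative, sketched here with hypothetical names (offset_indices, make_range), builds the same inclusive range from std::make_index_sequence, which standard libraries typically generate without linear recursion. It assumes C++14 and is a replacement I am suggesting, not part of the code on godbolt.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

namespace sketch {

// Declaration only: it is used inside decltype and never called.
// Maps 0, 1, ..., N-1 to From, From+1, ..., From+N-1.
template <int From, std::size_t... I>
std::integer_sequence<int, (From + static_cast<int>(I))...>
offset_indices(std::index_sequence<I...>);

// Inclusive range [From, To], matching the meaning of MakeSeq<From, To>.
template <int From, int To>
using make_range =
    decltype(offset_indices<From>(std::make_index_sequence<To - From + 1>{}));

static_assert(std::is_same<make_range<3, 7>,
                           std::integer_sequence<int, 3, 4, 5, 6, 7>>::value,
              "make_range<3, 7> should be 3, 4, 5, 6, 7");

}  // namespace sketch
```

LiteralToVariadicCharsImpl would then need to be specialized on std::integer_sequence<int, IDXS...> instead of Seq<IDXS...>.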
You can use std::string_view:
constexpr auto filename(std::string_view path)
{
return path.substr(path.find_last_of('/') + 1);
}
Usage:
static_assert(filename("/home/user/src/project/src/file.cpp") == "file.cpp");
static_assert(filename("./file.cpp") == "file.cpp");
static_assert(filename("file.cpp") == "file.cpp");
See it compile (godbolt.org).
For Windows:
constexpr auto filename(std::wstring_view path)
{
return path.substr(path.find_last_of(L'\\') + 1);
}
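Since __FILE__ is a narrow string literal even on Windows, one narrow function that accepts both separators may be enough. This is a sketch assuming C++17; the namespace sketch and the macro name SHORT_FILE are hypothetical.

```cpp
#include <string_view>

namespace sketch {

// Treat both '/' and '\\' as separators. find_last_of returns npos when
// neither occurs, and npos + 1 wraps around to 0, so the whole path is kept.
constexpr std::string_view filename(std::string_view path) {
    return path.substr(path.find_last_of("/\\") + 1);
}

static_assert(filename("/home/user/src/project/src/file.cpp") == "file.cpp");
static_assert(filename("C:\\src\\project\\file.cpp") == "file.cpp");
static_assert(filename("file.cpp") == "file.cpp");

}  // namespace sketch

// Hypothetical convenience macro for logging call sites.
#define SHORT_FILE (sketch::filename(__FILE__))
```

Because the result is a suffix of the original literal, data() still points at a null-terminated string. As UPD2 in the question points out, though, the full path may still end up in the binary.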
With C++17, you can do the following (https://godbolt.org/z/68PKcsPzs):
#include <cstdio>
#include <array>
namespace details {
template <const char *S, size_t Start = 0, char... C>
struct PastLastSlash {
constexpr auto operator()() {
if constexpr (S[Start] == '\0') {
return std::array{C..., '\0'};
} else if constexpr (S[Start] == '/') {
return PastLastSlash<S, Start + 1>()();
} else {
return PastLastSlash<S, Start + 1, C..., (S)[Start]>()();
}
}
};
}
template <const char *S>
struct PastLastSlash {
static constexpr auto a = details::PastLastSlash<S>()();
static constexpr const char * value{a.data()};
};
int main() {
static constexpr char f[] = __FILE__;
puts(PastLastSlash<f>::value);
return 0;
}
With C++14, it's a bit more complicated because of the more limited constexpr (https://godbolt.org/z/bzGec5GMv):
#include <cstdio>
#include <array>
namespace details {
// Generic form: just add the character to the list
template <const char *S, char ch, size_t Start, char... C>
struct PastLastSlash {
constexpr auto operator()() {
return PastLastSlash<S, S[Start], Start + 1, C..., ch>()();
}
};
// Found a '/', reset the character list
template <const char *S, size_t Start, char... C>
struct PastLastSlash<S, '/', Start, C...> {
constexpr auto operator()() {
return PastLastSlash<S, S[Start], Start + 1>()();
}
};
// Found the null-terminator, ends the search
template <const char *S, size_t Start, char... C>
struct PastLastSlash<S, '\0', Start, C...> {
constexpr auto operator()() {
return std::array<char, sizeof...(C)+1>{C..., '\0'};
}
};
}
template <const char *S>
struct PastLastSlash {
const char * operator()() {
static auto a = details::PastLastSlash<S, S[0], 0>()();
return a.data();
}
};
static constexpr char f[] = __FILE__;
int main() {
puts(PastLastSlash<f>{}());
return 0;
}
With C++20, it should be possible to pass __FILE__ directly to the template instead of needing those static constexpr variables.
I want to do something like this :
template<typename T>
const char * toStr(T num)
{
thread_local static char rc[someval*sizeof(T)] str = "0x000...\0"; // num of zeros depends on size of T
// do something with str
return str;
}
I'm guessing there's some template metaprogramming I'd have to do but I'm not sure where to start.
Edit:
I found a related question here: How to concatenate a const char* in compile time
But I don't want the dependency on boost.
I'm not sure I understand what you want, but if you want the initial value of str to be created at compile time, and you accept that toStr() calls a helper function (toStrH() in the following example), here is a C++14 example:
#include <utility>
template <typename T, std::size_t ... I>
const char * toStrH (T const & num, std::index_sequence<I...> const &)
{
static char str[3U+sizeof...(I)] { '0', 'x', ((void)I, '0')..., '\0' };
// do something with num
return str;
}
template <typename T>
const char * toStr (T const & num)
{ return toStrH(num, std::make_index_sequence<(sizeof(T)<<1U)>{}); }
int main()
{
toStr(123);
}
If you need a C++11 solution, substituting std::make_index_sequence and std::index_sequence isn't difficult.
I recently started playing around with template metaprogramming in C++ and have been trying to evaluate the length of a C-style string.
I've had some success with this bit of code
template <const char *str, std::size_t index>
class str_length {
public:
static inline std::size_t val() {
return (str[index] != '\0') ? (1 + str_length<str, index + 1>::val()) : 0;
}
};
template <const char *str>
class str_length <str, 500> {
public:
static inline std::size_t val() {
return 0;
}
};
extern const char bitarr[] { "0000000000000000000" };
int main() {
std::cout << str_length<bitarr, 0>::val() << std::endl;
getchar();
return 0;
}
However, I had to set an "upper limit" of 500 by creating a specialization of str_length. Omitting that would cause my compiler to run indefinitely (presumably creating infinite specializations of str_length).
Is there anything I could do to not specify the index = 500 limit?
I'm using VC++2015 if that helps.
Oh, and I'm not using constexpr because VC++ doesn't quite support the C++14 extended constexpr features yet. (https://msdn.microsoft.com/en-us/library/hh567368.aspx#cpp14table)
The usual way to stop infinite template instantiation in this situation is specialization, which is orthogonal to the constexpr-ness of anything. Reviewing the list of additional things that extended constexpr allows in C++14, I see nothing in the following example that needs extended constexpr support. gcc 6.1.1 compiles this in -std=c++11 compliance mode, FWIW:
#include <cstddef>
#include <cstdio>
#include <iostream>
template<const char *str, std::size_t index, char c> class str_length_helper;
template <const char *str>
class str_length {
public:
static constexpr std::size_t val()
{
return str_length_helper<str, 0, str[0]>::val();
}
};
template<const char *str, std::size_t index, char c>
class str_length_helper {
public:
static constexpr std::size_t val()
{
return 1+str_length_helper<str, index+1, str[index+1]>::val();
}
};
template<const char *str, std::size_t index>
class str_length_helper<str, index, 0> {
public:
static constexpr std::size_t val()
{
return 0;
}
};
static constexpr char bitarr[] { "0000000000000000000" };
int main() {
std::cout << str_length<bitarr>::val() << std::endl;
getchar();
return 0;
}
Note, however, that the character string itself must be constexpr. As the comments noted, this is of dubious practical use; but there's nothing wrong with messing around in this manner in order to get the hang of metaprogramming.
The key point is the use of specialization, and the fact that in order to be able to use str[index] as a template parameter, str must be constexpr.
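For comparison, if the goal is only the length and not the metaprogramming exercise, a single-return constexpr function already fits the C++11 rules. This is a sketch, with hypothetical names cstr_length and literal_length, assuming the compiler supports C++11 constexpr.

```cpp
#include <cstddef>

namespace sketch {

// C++11-style constexpr: one return statement, recursion instead of a loop.
constexpr std::size_t cstr_length(const char* s) {
    return *s == '\0' ? 0 : 1 + cstr_length(s + 1);
}

// For arrays and literals the size is already part of the type.
// N counts the terminating '\0', hence the - 1.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) {
    return N - 1;
}

static_assert(cstr_length("hello") == 5, "recursive length");
static_assert(literal_length("hello") == 5, "array-size length");

}  // namespace sketch
```

Note that literal_length reports the array size minus one, so it differs from cstr_length when the array contains an embedded '\0'.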
As stated in this link:
There is no specialization for C strings. std::hash produces a hash of the value of the pointer (the memory address), it does not examine the contents of any character array.
This means that the same string contents, stored at different addresses, can produce different hash codes. For example, with this code:
//MOK and MOV are template arguments
void emit(MOK key, MOV value) {
auto h = hash<MOK>()(key);
cout<<"key="<<key<<" h="<<h<<endl;
...
This is the output produced by calling emit() four times with the same key contents (with MOK=char*), but with four different tokens/string objects:
key=hello h=140311481289184
key=hello h=140311414180320
key=hello h=140311414180326
key=hello h=140311481289190
How can I obtain the same hash code for char* keys with equal contents? I'd prefer not to use Boost.
There is of course the trivial (and slow) solution of creating a temporary std::string and hashing that one. If you don't want to do this, I'm afraid you will have to implement your own hash function. Sadly enough, the current C++ standard library doesn't provide general purpose hash algorithms disentangled from object-specific hash solutions. (But there is some hope this could change in the future.)
Suppose you had a function
std::size_t
hash_bytes(const void * data, std::size_t size) noexcept;
that would take an address and a size and return a hash computed from that many bytes following that address. With the help of that function, you could easily write
template <typename T>
struct myhash
{
std::size_t
operator()(const T& obj) const noexcept
{
// Fallback implementation.
auto hashfn = std::hash<T> {};
return hashfn(obj);
}
};
and then specialize it for the types you're interested in.
template <>
struct myhash<std::string>
{
std::size_t
operator()(const std::string& s) const noexcept
{
return hash_bytes(s.data(), s.size());
}
};
template <>
struct myhash<const char *>
{
std::size_t
operator()(const char *const s) const noexcept
{
return hash_bytes(s, std::strlen(s));
}
};
This leaves you only with the exercise of implementing hash_bytes. Fortunately, there are some fairly good hash functions that are rather easy to implement. My go-to algorithm for simple hashing is the Fowler-Noll-Vo hash function. You can implement it in five lines of code; see the linked Wikipedia article.
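To show how small the basic version is, here is a sketch of the 32-bit FNV-1a variant using the commonly published offset basis and prime. The function name fnv1a_32 is my own, and it returns std::uint32_t rather than std::size_t to keep the example fixed-width.

```cpp
#include <cstddef>
#include <cstdint>

namespace sketch {

// FNV-1a: xor each byte into the state, then multiply by the prime.
inline std::uint32_t fnv1a_32(const void* data, std::size_t size) noexcept {
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::uint32_t hash = 2166136261u;  // 32-bit offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= bytes[i];
        hash *= 16777619u;             // 32-bit FNV prime
    }
    return hash;
}

}  // namespace sketch
```

Hashing zero bytes returns the offset basis unchanged, which makes a handy sanity check.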
If you want to get a bit fancy, consider the following implementation. First, I define a generic template that can be specialized for any version of the FNV-1a hash function.
template <typename ResultT, ResultT OffsetBasis, ResultT Prime>
class basic_fnv1a final
{
static_assert(std::is_unsigned<ResultT>::value, "need unsigned integer");
public:
using result_type = ResultT;
private:
result_type state_ {};
public:
constexpr
basic_fnv1a() noexcept : state_ {OffsetBasis}
{
}
constexpr void
update(const void *const data, const std::size_t size) noexcept
{
const auto cdata = static_cast<const unsigned char *>(data);
auto acc = this->state_;
for (auto i = std::size_t {}; i < size; ++i)
{
const auto next = std::size_t {cdata[i]};
acc = (acc ^ next) * Prime;
}
this->state_ = acc;
}
constexpr result_type
digest() const noexcept
{
return this->state_;
}
};
Next, I provide aliases for the 32 and 64 bit versions. The parameters were taken from Landon Curt Noll's website.
using fnv1a_32 = basic_fnv1a<std::uint32_t,
UINT32_C(2166136261),
UINT32_C(16777619)>;
using fnv1a_64 = basic_fnv1a<std::uint64_t,
UINT64_C(14695981039346656037),
UINT64_C(1099511628211)>;
Finally, I provide type meta-functions to select a version of the algorithm given the wanted number of bits.
template <std::size_t Bits>
struct fnv1a;
template <>
struct fnv1a<32>
{
using type = fnv1a_32;
};
template <>
struct fnv1a<64>
{
using type = fnv1a_64;
};
template <std::size_t Bits>
using fnv1a_t = typename fnv1a<Bits>::type;
And with that, we're good to go.
constexpr std::size_t
hash_bytes(const void *const data, const std::size_t size) noexcept
{
auto hashfn = fnv1a_t<CHAR_BIT * sizeof(std::size_t)> {};
hashfn.update(data, size);
return hashfn.digest();
}
Note how this code automatically adapts to platforms where std::size_t is 32 or 64 bits wide.
I've had to do this before and ended up writing a function to do this, with essentially the same implementation as Java's String hash function:
size_t hash_c_string(const char* p, size_t s) {
size_t result = 0;
const size_t prime = 31;
for (size_t i = 0; i < s; ++i) {
result = p[i] + (result * prime);
}
return result;
}
Mind you, this is NOT a cryptographically secure hash, but it is fast enough and yields good results.
In C++17 you should use std::hash<std::string_view> which works seamlessly since const char* can be implicitly converted to it.
Since C++17 added std::string_view including a std::hash specialization for it you can use that to compute the hash value of a C-string.
Example:
#include <string_view>
#include <cstring>
static size_t hash_cstr(const char *s)
{
return std::hash<std::string_view>()(std::string_view(s, std::strlen(s)));
}
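If the goal is to use const char* keys in an unordered container, the hash alone is not enough, because the default equality still compares pointers. A sketch with hypothetical names (cstr_hash, cstr_equal, cstr_map), assuming C++17 and non-null, null-terminated keys:

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>
#include <unordered_map>

namespace sketch {

// Hash the characters, not the pointer value.
struct cstr_hash {
    std::size_t operator()(const char* s) const noexcept {
        return std::hash<std::string_view>{}(std::string_view(s));
    }
};

// Equality must agree with the hash: compare contents as well.
struct cstr_equal {
    bool operator()(const char* a, const char* b) const noexcept {
        return std::strcmp(a, b) == 0;
    }
};

using cstr_map = std::unordered_map<const char*, int, cstr_hash, cstr_equal>;

}  // namespace sketch
```

The map stores only the pointers, so the strings they point to must outlive the map.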
If you have to deal with a pre-C++17 compiler, you can check your standard library implementation for an implementation-defined hash function and call that.
For example, libstdc++ (which is what GCC uses by default) provides std::_Hash_bytes which can be called like this:
#include <functional>
// -> which finally includes /usr/include/c++/$x/bits/hash_bytes.h
#include <cstring>
static size_t hash_cstr_gnu(const char *s)
{
const size_t seed = 0;
return std::_Hash_bytes(s, std::strlen(s), seed);
}
You can use std::collate::hash
e.g. https://www.cplusplus.com/reference/locale/collate/hash/
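A minimal sketch of what that could look like, assuming the classic "C" locale is acceptable. The function name hash_cstr_collate is hypothetical; note that std::collate::hash returns a long rather than a std::size_t.

```cpp
#include <cstring>
#include <locale>

namespace sketch {

// Hash the characters in [s, s + strlen(s)) with the collate facet
// of the classic "C" locale.
inline long hash_cstr_collate(const char* s) {
    const std::collate<char>& facet =
        std::use_facet<std::collate<char>>(std::locale::classic());
    return facet.hash(s, s + std::strlen(s));
}

}  // namespace sketch
```

Two buffers with equal contents hash to the same value, regardless of their addresses.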