Ways to distinguish string literal and runtime generated string

Suppose I have a function that takes a string as input:
SomeOutputType f_impl(const char* s);
Most call sites just use string literals as input, e.g. f("Hello, world"). Suppose I have implemented the following function to compute the result at compile time
template <char...> SomeOutputType f_impl();
My question is: is there a way to make call sites like f("Hello, world") call the templated form, while general call sites like string s="Hello, world"; f(s.c_str()); call the general form? For clarification, auto s = "Hello, world"; f(s); doesn't have to call the templated form because s is now a variable and no longer a compile-time constant.
A useful case for this question is optimizing printf. In most cases the format is a string literal, so much of the work, such as parsing the format, could be done at compile time instead of at runtime.

No, a string literal like "foo" has the type const char[S + 1] where S is the number of characters you wrote. It behaves like an array of that type with no special rules.
In C++03, there was a special rule that said that a string literal could convert to char*. That allowed you to say
#define isStringLiteral(X) \
isConvertibleToCharStar(X) && hasTypeConstCharArray(X)
For example, isStringLiteral(+"foo") would yield false, and isStringLiteral("foo") would yield true. Even this possibility would not have allowed you to call a function with a string literal argument and have it behave differently.
C++11 removed that special conversion rule and string literals behave like any other arrays. In C++11 as a dirty hack you can compose some macros, matching some simple string literals without handling escape sequences
constexpr bool isStringLiteral(const char *x, int n = 0) {
return *x == '"' ?
n == 0 ?
isStringLiteral(x + 1, n + 1)
: !*(x + 1)
: (*x && n != 0 && isStringLiteral(x + 1, n + 1));
}
#define FastFun(X) \
(isStringLiteral(#X) ? fConstExpr(X, sizeof(X) - 1) : f(X))
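To make the behaviour of the stringification trick concrete, here is a small sketch that only exercises the detection part. The macro name IS_LITERAL_TOKEN and the identifier some_variable are made up for illustration; isStringLiteral is the function from above, reformatted. Note that the check looks at the spelling of the argument, not at its type, and that an escaped quote defeats it, as mentioned.

```cpp
// Same logic as the isStringLiteral above: true only for a token spelled
// as "..." with no double quote between the opening and closing quote.
constexpr bool isStringLiteral(const char* x, int n = 0) {
    return *x == '"'
        ? (n == 0 ? isStringLiteral(x + 1, n + 1) : !*(x + 1))
        : (*x && n != 0 && isStringLiteral(x + 1, n + 1));
}

// Hypothetical helper macro: stringify the argument and inspect its spelling.
// The argument is never evaluated, so it does not even have to be declared.
#define IS_LITERAL_TOKEN(X) isStringLiteral(#X)

static_assert(IS_LITERAL_TOKEN("foo"), "a plain literal is recognized");
static_assert(!IS_LITERAL_TOKEN(some_variable), "an identifier is rejected");
static_assert(!IS_LITERAL_TOKEN(+"foo"), "an expression is rejected");
static_assert(!IS_LITERAL_TOKEN("a\"b"), "an escaped quote defeats the check");
```

Because the test is purely textual, prefixed literals or adjacent literals such as "a" "b" would presumably be rejected as well.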

While I haven't tested this, I think if you just declare the function constexpr and compile with high optimization, the compiler will compute at compile time whenever possible. As a bonus, you don't need to write the code twice. On the other hand, you have to write it once in constexpr style.
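As a rough sketch of that idea (assuming C++14 or later for the loop inside a constexpr function): count_specs below is a hypothetical stand-in for f that counts conversion specifiers in a printf-style format. The same function can be forced to run at compile time by using it in a constant expression, and it can still be called with a runtime pointer.

```cpp
#include <cstddef>

// Hypothetical stand-in for f: counts '%' conversion specifiers in a
// printf-style format string, treating "%%" as an escaped percent sign.
constexpr std::size_t count_specs(const char* s) {
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (*s == '%') {
            if (s[1] == '%') {
                ++s;  // skip the second '%' of an escaped percent sign
            } else {
                ++n;
            }
        }
    }
    return n;
}

// Using the result in a constant expression forces compile-time evaluation.
static_assert(count_specs("%d items, %s") == 2, "two specifiers");
static_assert(count_specs("100%% done") == 0, "escaped percent only");
```

Without a constant-expression context such as static_assert or a constexpr variable, whether the call is folded at compile time is up to the optimizer, so this does not by itself guarantee the literal case is free.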

If I understand the question correctly, I actually think something like this is possible using a function overload. Here's an article that shows the basic idea. In your case I think it would be sufficient to have the following two overloads:
void f(char const *);
template<unsigned int N>
void f(char const (&)[N]);
The latter should be invoked when the argument is a string literal, the former at other times. If the compiler is sufficiently good at optimizing, then calls to the latter may be evaluated at compile time.
EDIT:
Alright, it bothered me that the above solution didn't work, so I did some playing around and I think I came up with a solution:
#include <iostream>
#include <string>
#include <boost/utility/enable_if.hpp>
template<typename T>
struct is_string_literal {
enum { value = false };
};
template<unsigned int N>
struct is_string_literal<char const (&)[N]> {
enum { value = true };
};
template<typename T>
typename boost::disable_if<is_string_literal<T> >::type
foo(T) {
std::cout << "foo1" << std::endl;
}
template<int N>
void foo(char const (&)[N]) {
std::cout << "foo2" << std::endl;
}
int main( ) {
std::string bar = "blah";
char const str[] = "blah";
foo(str);
foo("blah");
foo(bar.data());
}
The output (on GCC 4.4 with -O3) is:
foo2
foo2
foo1
I admit that I don't completely understand why this works when the previous solution didn't. Maybe there's something about overload resolution that I don't completely understand.
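A possible explanation, offered as a sketch rather than a definitive reading of the standard: when a non-template f(char const*) competes with the array template, both conversions have the same rank, and the tie is broken in favour of the non-template function. When both candidates are templates, partial ordering applies instead and picks the more specialized array overload. If that reading is right, the disable_if is not what makes the difference. The names which and plain below are made up for illustration, and the asserted results reflect the behaviour I would expect from GCC and Clang.

```cpp
// Two templates: partial ordering prefers the more specialized array overload.
template <typename T>
constexpr int which(T) { return 1; }

template <unsigned N>
constexpr int which(const char (&)[N]) { return 2; }

// Non-template versus template: the conversions rank equally, so the
// non-template overload wins the tie-break, even for a string literal.
constexpr int plain(const char*) { return 1; }

template <unsigned N>
constexpr int plain(const char (&)[N]) { return 2; }

constexpr const char* ptr = "blah";

static_assert(which("blah") == 2, "literal picks the array template");
static_assert(which(ptr) == 1, "pointer picks the generic template");
static_assert(plain("blah") == 1, "literal still picks the non-template");
```

Note that, as the output above shows for str, any char array is treated like a literal by this technique, not just string literals.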

Related

What is the proper definition for a constexpr function that take a character array?

I'm writing a hashing function to help speed up string comparisons.
My codebase compares strings against a lot of const char[] constants, and it would be ideal if I could work with hashes instead. I went ahead and translated xxHash to modern C++, and I have a working prototype that does work at compile time, but I'm not sure what the function definition should be for the main hashing function.
At the moment, I have this:
template <size_t arr_size>
constexpr uint64_t xxHash64(const char(data)[arr_size])
{...}
This does work, and I am able to do a compile time call like this
constexpr char myString[] = "foobar";
constexpr uint64_t hashedString = xxHash64<sizeof myString>(myString);
[Find a minimal example here]
All good so far, but I would like to add a user-defined literal wrapper function for some eye candy, and this is where the problem lies.
UDLs come with a fixed prototype, as specified here
The Microsoft doc stipulates "Also, any of these operators can be defined as constexpr".
But when I try to call my hashing function from a constexpr UDL:
constexpr uint64_t operator "" _hashed(const char *arr, size_t size) {
return xxHash64<size>(arr);
}
function "xxHash64" cannot be called with the given argument list
argument types are: (const char*)
And the error does make sense. My function expects a character array, and instead it gets a pointer.
But if I were to modify the definition of my xxHash64 function to take a const char *, I can no longer work in a constexpr context because the compiler needs to resolve the pointer first, which happens at runtime.
So am I doing anything wrong here, or is this a limitation of UDLs or constexpr functions as a whole?
Again, I'm not 100% sure the templated definition at the top is the way to go, but I'm not sure how else I could read characters from a string at compile time.
I'm not limited by any compiler version or library. If there is a better way to do this, feel free to suggest.

There is no problem with calling a constexpr function on a pointer inside a constant expression. For example,
#include <cstddef>
#include <cstdint>
constexpr std::uint64_t xxHash64(const char* s){return s[0];}
constexpr std::uint64_t operator "" _g(const char *arr,std::size_t){
return xxHash64(arr);
}
int main()
{
xxHash64("foo");
constexpr auto c = "foobar"_g;
return c;
}
would just work fine.
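To show that this also holds for a function that actually walks the whole string, here is a sketch that passes the length as an ordinary function argument instead of a template argument (a function parameter such as size is never a constant expression, which is why xxHash64<size>(arr) is rejected). It uses 64-bit FNV-1a as a simple stand-in, not xxHash; the names fnv1a and _fnv are made up, and C++14 is assumed for the loop.

```cpp
#include <cstddef>
#include <cstdint>

// 64-bit FNV-1a, used here only as a short stand-in for the real hash.
constexpr std::uint64_t fnv1a(const char* s, std::size_t n) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (std::size_t i = 0; i < n; ++i) {
        h ^= static_cast<unsigned char>(s[i]);
        h *= 1099511628211ULL;                  // FNV prime
    }
    return h;
}

// The length arrives as a normal argument, so no template parameter is needed.
constexpr std::uint64_t operator""_fnv(const char* s, std::size_t n) {
    return fnv1a(s, n);
}

static_assert(""_fnv == 14695981039346656037ULL, "empty input yields the basis");
static_assert("foobar"_fnv == fnv1a("foobar", 6), "literal and direct call agree");
static_assert("a"_fnv != "b"_fnv, "different inputs, different hashes");
```

The same function can be called at runtime with any pointer and length, for example with std::string::data() and size().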

With C++20, you can also get the size as a constant expression by using a string literal operator template.
#include <cstddef>
#include <cstdint>
template <std::size_t arr_size>
constexpr std::uint64_t xxHash64(const char(&data)[arr_size]){
return data[0];
}
// template <std::size_t N> // can also be full class template (with CTAD)
struct hash_value{
std::uint64_t value;
template <std::size_t N>
constexpr hash_value(const char(&p)[N]):value(xxHash64(p)){}
};
template < hash_value v >
constexpr std::uint64_t operator ""_hashed() { return v.value; }
int main()
{
constexpr auto v = "foobar"_hashed;
return v;
}

Is it possible with C++20 to have a constexpr function return a tuple of types that have static constexpr array's with a value passed though a macro?

After two or three days of trying, I had to give up and wrote a "minimal" test case I hope demonstrates the problem.
What I need is a method to convert string literals that are passed as macro arguments without quotes into strings (concatenated with a prefix) that are accessible in a constexpr environment (see https://wandbox.org/permlink/Cr6j6fXemsQRycHI for the Real Code(tm)). That means the macro arguments should be stringified and then converted either into a type (e.g. template<... 'h', 'e', 'l', 'l', 'o', ...>) or into a static constexpr array<char, N> of a unique type that is passed instead (e.g. template<... A<1> ...>, where A<1>::str is a static constexpr array<char, 6> with the contents 'h', 'e', 'l', 'l', 'o', '\0').
I strongly prefer the latter, and the former only if the latter isn't possible.
To demonstrate the exact problem/requirement in a short test case I came up with the following:
Some headers...
#include <array>
#include <tuple>
#include <cassert>
#include <string>
#include <iostream>
Then for the sake of demonstrating how the end-result should behave:
template<int I>
struct A;
template<>
struct A<0>
{
static constexpr auto str = std::to_array("abc"); // The string-literal "abc" may NOT appear here.
// The code should work for any macro argument
// (though for this test case you may assume all
// string literals are three chars).
};
template<>
struct A<1>
{
static constexpr auto str = std::to_array("def"); // Same.
};
constexpr auto f(char const* s0, char const* s1)
{
return std::tuple<
A<0>, // The 0, because this is the first argument, makes the type (A<0>) unique, and therefore
// a static constexpr can be part of that unique type that contains the string s0.
A<1> // Same for 1 and A<1>.
>{};
}
#define STR(arg) #arg
#define STRINGIFY(arg) STR(arg)
#define MEMBER(arg) STRINGIFY(arg)
And finally the rest of the code that hopefully enforces everything I need
the above to do:
//=====================================================================
// NOTHING BELOW THIS LINE MAY BE CHANGED.
struct C
{
static constexpr auto x = f(
MEMBER(abc),
MEMBER(def)
);
};
int main()
{
// The type returned by f() is a tuple.
using xt = decltype(C::x);
// Each element of that tuple must be a type...
using e0 = std::tuple_element_t<0, xt>;
using e1 = std::tuple_element_t<1, xt>;
// ... that defines a static constexpr array<> 'str'.
constexpr std::array a0 = e0::str;
constexpr std::array a1 = e1::str;
std::string s0{a0.begin(), a0.end()}; // Note that the array str includes a terminating zero.
std::string s1{a1.begin(), a1.end()};
std::cout << "s0 = \"" << s0 << "\"\n";
std::cout << "s1 = \"" << s1 << "\"\n";
// ... that has the value that was passed as macro argument.
assert(s0.compare("abc") && s0[3] == '\0');
assert(s1.compare("def") && s1[3] == '\0');
}
The second code block needs some obvious fixes:
it contains hard-coded strings for "abc" and "def" which are supposed to come from the macro arguments passed to MEMBER at the top of the third block.
The arguments passed to f() are not even used.
The idea here is that each argument of f() results in an element type of the returned tuple - for the sake of simplicity I made this fixed: just two arguments.
Each element of the tuple is guaranteed to be a unique type, but that uniqueness depends on the fact that it contains an integer template parameter that is incremented for each argument. In the real code the uniqueness is guaranteed further by only invoking f() once per unique (user) class (C above), but since f() is only invoked once in this test case anyway, I left that out. Literally returning std::tuple<A<0>, A<1>> is therefore in theory the goal.
Since each tuple element is a unique type, in theory they can contain a static constexpr array with the argument that was passed to the MEMBER macro as their content. However, I don't see how this is possible to achieve.
What I really need is "encoded" in the third block: if that works for any identifier string (i.e. no spaces, if that matters) as macro arguments (after also replacing the test strings at the end), then the interface of block two should teach me how to do this.
The complete test case can be found online here: https://wandbox.org/permlink/vyPK9qktAzcdP3wt
EDIT
Thanks to pnda's solution, we now have an answer; I just made some tiny changes so that the third code block can stay as-is. With the following as second code block it compiles and works!
template <std::size_t N>
struct TemplateStringLiteral {
std::array<char, N> chars;
consteval TemplateStringLiteral(std::array<char, N> literal) : chars(literal) { }
};
template<TemplateStringLiteral literal>
struct B {
static constexpr auto str = literal.chars;
};
template<TemplateStringLiteral s>
struct Wrap
{
};
template <TemplateStringLiteral s0, TemplateStringLiteral s1>
consteval auto f(Wrap<s0>, Wrap<s1>)
{
return std::tuple<
B<s0>,
B<s1>
>{};
}
#define STR(arg) #arg
#define STRINGIFY(arg) STR(arg)
#define MEMBER(arg) Wrap<std::to_array(STRINGIFY(arg))>{}

It is possible to store a string literal inside of a template argument. I wrote a class that does exactly that a year or so ago. This allows the tuple creation to be constexpr while also offering a quite nice way of passing the strings. This class can also be used to differentiate between two types using a string without modifying the rest of the templated class, which is very useful if you're using typeless handles.
template <std::size_t N>
struct TemplateStringLiteral {
char chars[N];
consteval TemplateStringLiteral(const char (&literal)[N]) {
// Does anyone know of a replacement of std::copy_n? It's from
// <algorithm>.
std::copy_n(literal, N, chars);
}
};
Using this class, we can pass string literals to the tuple-creating function as template arguments. This does, sadly, slightly modify your C::x: the MEMBER define is now used for the template arguments instead of the function arguments. It is also perfectly possible to just use normal string literals in the template arguments.
struct C
{
// Essentially just f<"abc", "def">();
static constexpr auto x = f<MEMBER(abc), MEMBER(def)>();
};
For the function f we now want to take the parameters as template arguments. So we'll now use the TemplateStringLiteral class to take in the string literals, while also making it a template parameter pack to allow for any amount of parameters.
template <TemplateStringLiteral... literal>
consteval auto f() {
// I'll talk about A in a second. It just holds our string
return std::tuple<A<literal>...> {};
}
Now that we've got a function f that can create a std::tuple from some amount of string literals passed in through template parameters, we'll just need to define the class A that the tuple holds. We can't pass the string literals directly to std::tuple as it does not have something akin to TemplateStringLiteral. Defining class A is very simple, as we only need a str field holding our string literal.
template<TemplateStringLiteral literal>
struct A {
static constexpr auto str = std::to_array(literal.chars);
};
So, using TemplateStringLiteral we've got an implementation that's about 16 lines of C++ and is pretty easily understandable imo.

Strange syntax for passing a const char parameter to deduce length as template parameter. What is happening?

I found this gem in our codebase.
constexpr bool ConstexprStrBeginsWithImpl(const char* str, const char* subStr)
{
return !subStr[0] ? true : (str[0] == subStr[0] && ConstexprStrBeginsWithImpl(str + 1, subStr + 1));
}
template<int N, int M>
constexpr bool ConstexprStrBeginsWith(const char(&str)[N], const char(&subStr)[M])
{
static_assert(M <= N, "The substring to test is longer than the total string");
return ConstexprStrBeginsWithImpl(str, subStr);
}
Now I get what it does (comparing two constant strings as a constexpr), but what is this strange calling syntax const char(&str)[N]? to deduce the template int-parameter with the length of a constant char? How does this work? How is that a legal syntax? :-O
I thought you had to declare a constant char array parameter like this: const char str[N]?
If I use that - to me more logical - version, then my compilers (VCL and GCC) complain that they can't deduce the int-parameter N when using the constexpr as a parameter to another template with a bool. For example in this scenario:
template<bool B> struct Yada { int i = 23; };
template<> struct Yada<true> { int i = 42; };
int main()
{
Yada<ConstexprStrBeginsWith("foobar", "foo")> y;
std::cout << y.i;
}
This only compiles, if I declare str and subStr via const char(&str)[N] instead of just const char str[N].
So.... I am happy that it compiles and it looks certainly clever, but.. is this legal syntax? What is declared here? :-O. #justcurious
Greetings, Imi.

Thanks to @Thomas, @Jarod42 and @largest_prime_is_463035818, I could piece the puzzle together:
The & before "str" declares a reference to a char array instead of a char array passed by value. The parentheses are needed because [] binds more tightly than &; without them the declaration would read as an array of references.
The template cannot deduce the size of the char array when it is passed by value because a C array parameter decays to a pointer, whereas a reference to a C array never decays. Jarod42 has a nice example of how to use templates instead, if (for some reason) you don't like to use references to C arrays.
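A small sketch of both points, with a made-up function name literal_length: a by-value array parameter is adjusted to a pointer, so its declared size is not part of the function type at all, while a reference to an array keeps the size and lets the template deduce it.

```cpp
#include <cstddef>
#include <type_traits>

// Reference to an array of N chars: N is deduced from the argument's type.
// Without the parentheses, "const char &str[N]" would declare an array of
// references, which is not allowed.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) {
    return N - 1;  // N includes the terminating '\0'
}

// A "by value" array parameter is really a pointer parameter, so these two
// function types are identical and the 7 carries no information.
static_assert(std::is_same<void(const char[7]), void(const char*)>::value,
              "array parameters are adjusted to pointers");

// A string literal is an lvalue of array type, so it binds to the reference.
static_assert(std::is_same<decltype("foobar"), const char (&)[7]>::value,
              "\"foobar\" has type const char[7]");
static_assert(literal_length("foobar") == 6, "size deduced through the reference");
```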

c++ constexpr pointer casting

The code below explains the problem
constexpr int x = *(reinterpret_cast<const int*>("abcd")); //wrong
constexpr int y = ("abcd")[0]*1l + ("abcd")[1]*256l + ("abcd")[2]*256*256l + ("abcd")[3]*256*256*256l; //ok
How can I do such type casting in constexpr expression?
UPDATE
The reason of doing this:
I'm writing set of templates for manipulating c-style strings in compile time. It uses such representation of string:
template<char... args> struct String {static const char data[sizeof...(args)+1];};
template<char... args> constexpr const char String<args...>::data[sizeof...(args)+1] = {args...,0};
So in my program I can do this:
String<'H','e','l','l','o',' ','w','o','r','l','d','!'>
But I can not do this:
String<"Hello world!">
I have a partial solution for short strings:
template<int N,char... chrs> struct Int2String : Int2String<(N>>8),N,chrs...> {};
template<char... chrs> struct Int2String<0,chrs...> : String<chrs...> {};
...
Int2String<'Hell'>
It uses C multi-character literals, so it works only with strings of length 4 or less and depends on the platform, but it looks much better. I'm OK with these restrictions, but sometimes I want to use a string defined in a library, so I can't change the " quotes to ' quotes. In the example above I'm trying to convert "abcd" to 'abcd'. I expect this to have the same representation in memory, so pointer casting looks like a good idea. But I can't do this at compile time.

Because things like:
("abcd")[0]
simply equate to:
'a'
which gets integral promotion in the expression:
'a' * 1l
But things like:
*(reinterpret_cast<const int*>("abcd"))
use reinterpret_cast, which is not permitted in a constant expression; they are trying to read through a pointer to static memory whose address is only known at link time.
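The element-by-element approach can be wrapped in a constexpr function so the arithmetic is not repeated at every use. This is only a sketch: pack_le is a made-up name, C++14 is assumed for the loop, and the asserted values assume an ASCII execution character set. It puts the first character in the lowest byte, like the expression for y in the question, which need not match how a given compiler encodes the multi-character literal 'abcd'.

```cpp
#include <cstddef>
#include <cstdint>

// Packs up to four leading characters of a string literal into an integer,
// first character in the lowest byte, without any pointer cast.
template <std::size_t N>
constexpr std::uint32_t pack_le(const char (&s)[N]) {
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < N - 1 && i < 4; ++i) {
        v |= static_cast<std::uint32_t>(static_cast<unsigned char>(s[i]))
             << (8 * i);
    }
    return v;
}

// Assuming ASCII: 'a' == 0x61, 'b' == 0x62, 'c' == 0x63, 'd' == 0x64.
static_assert(pack_le("abcd") == 0x64636261u, "four characters packed");
static_assert(pack_le("a") == 0x61u, "shorter strings leave high bytes zero");
```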

Any way to make parameterized user defined literals?

A little while ago I had an idea about "parameterized" user-defined literals and was wondering if there is any way to do this in the current C++ standard.
Basically, the idea is to have a user-defined literal whose behaviour can be tweaked according to some parameters. As a simple example, I chose a "fixed-point" literal which turns a floating-point number into an integer; the parameter is the precision in terms of the number of decimal places.
This is just an exercise for now, since I'm not sure how or if this would be useful in a real application.
My first idea went something like this:
namespace fp_impl {
constexpr int floor(long double n) {
return n;
}
constexpr int pow10(int exp) {
return exp == 0 ? 1 : 10 * pow10(exp - 1);
}
template<int i>
constexpr int fixed_point(long double n) {
return floor(n * pow10(i));
}
namespace fp2 {
constexpr int operator"" _fp (long double n) {
return fixed_point<2>(n);
}
}
namespace fp4 {
constexpr int operator"" _fp (long double n) {
return fixed_point<4>(n);
}
}
}
template<int prec> struct fp;
template<> struct fp<2> {
namespace lit = fp2;
};
template<> struct fp<4> {
namespace lit = fp4;
};
int main() {
{
using namespace fp<2>::lit;
std::cout << 5.421_fp << std::endl; // should output 542
}
{
using namespace fp<4>::lit;
std::cout << 5.421_fp << std::endl; // should output 54210
}
}
However, it doesn't compile because namespace aliases are not permitted at class scope. (It also has the problem of requiring you to manually define every version of operator"" _fp.) So I decided to try something with macros:
namespace fp {
namespace detail {
constexpr int floor(long double n) {
return n;
}
constexpr int pow10(int exp) {
return exp == 0 ? 1 : 10 * pow10(exp - 1);
}
template<int i>
constexpr int fixed_point(long double n) {
return floor(n * pow10(i));
}
}
}
#define SPEC(i) \
namespace fp { \
namespace precision##i { \
constexpr int operator"" _fp(long double n) { \
return fp::detail::fixed_point<i>(n); \
} \
} \
}
SPEC(2); SPEC(4);
#undef SPEC
#define fp_precision(i) namespace fp::precision##i
int main() {
{
using fp_precision(2);
std::cout << 5.421_fp << std::endl;
}
{
using fp_precision(4);
std::cout << 5.421_fp << std::endl;
}
}
This works, though it still has the requirement of using the SPEC() macro for every precision you ever want to use. Of course, some preprocessor trickery could be used to do this for every value from, say, 0 to 100, but I'm wondering if there could be anything more like a template solution, where each one is instantiated as it is needed. I had a vague idea of using an operator"" declared as a friend function in a template class, though I suspect that won't work either.
As a note, I did try template<int i> constexpr int operator"" _fp(long double n), but it seems this is not an allowed declaration of a literal operator.

You can return a class type that has operator()(int) overloaded from your literal operator. Then you could write
5.421_fp(2);
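A minimal sketch of what that could look like, with made-up names fixed_point_builder and ipow10. The literal operator only captures the value, and the precision is supplied through the call operator. The checks use values that are exactly representable in binary floating point, since truncating something like 5.421 * 10000 may land one below the expected result.

```cpp
// Integer power of ten, written recursively like the pow10 in the question.
constexpr int ipow10(int exp) {
    return exp == 0 ? 1 : 10 * ipow10(exp - 1);
}

// Hypothetical intermediate type returned by the literal operator.
struct fixed_point_builder {
    long double value;

    // The "parameter" of the literal is passed here, e.g. 1.25_fp(2).
    constexpr int operator()(int precision) const {
        return static_cast<int>(value * ipow10(precision));
    }
};

constexpr fixed_point_builder operator""_fp(long double n) {
    return fixed_point_builder{n};
}

static_assert(5.5_fp(1) == 55, "one decimal place");
static_assert(1.25_fp(2) == 125, "two decimal places");
```

Since everything is constexpr, the precision can be any constant expression, and no per-precision namespace or macro is needed.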

A user-defined literal function takes as its sole argument the literal itself. You can use state outside the function, for example with a global or thread-local variable, but that's not very clean.
If the argument will always be compile-time constant, and it's part of the number, pass it through the literal itself. That requires writing an operator "" _ ( char const *, std::size_t ) overload or template< char ... > operator "" _ () template and parsing the number completely by yourself.
You will have to work such a parameter into the existing floating-point grammar, though. Although C++ defines a very open-ended preprocessing-number construct, a user-defined literal must be formed from a valid token with a ud-suffix identifier appended.
You might consider using strings instead of numbers, but then the template option goes away.

One does not need macros to solve the problem. Since the problem concerns processing numeric literals (e.g., integers or floating-point formatted numbers), one can use the template definition of the literal operator and template metaprogramming to do the job completely at compile-time.
To do your fixed-point literal conversions, you could use the integer literal operator with unsigned long long, e.g.,
some_type operator "" _fp(unsigned long long num)
{
// code
}
(or with long double with possible loss of precision) but this causes everything to happen at run-time.
C++11, in section 2.14.8 (User-defined literals [lex.ext]), paragraphs 3 and 4, defines literal operator variations, including a template version for integer and floating-point literals! Unfortunately, paragraphs 5 and 6 do not define a template version for string and character literals. This means this technique will only work with integer and floating-point literals.
From C++11 section 2.14.8 the above _fp literal operator can therefore be written instead as:
template <char... Digits>
constexpr some_type operator "" _fp()
{
return process_fp<2, Digits...>::to_some_type();
}
e.g., where the 2 is a value for the int i template parameter from the OP and some_type is whatever the return type needs to be. Notice that the template parameter is a char --not an int or some other number. Also notice that the literal operator has no arguments. Thus code like Digit - '0' is needed to convert each character to its numeric value. Moreover, Digits... will be processed in a left-to-right order.
Now one can use template metaprogramming with process_fp whose forward declaration would look like:
template <int i, char... Digits>
struct process_fp;
and would have a static constexpr method called to_some_type() to compute and return the desired, compile-time result.
One might also want a meaningful, simple example of how this is done. Last year I wrote code (link below) that when used like this:
int main()
{
using namespace std;
const unsigned long long bits =
11011110101011011011111011101111_binary;
cout << "The number is: " << hex << bits << endl;
}
would convert the binary number 11011110101011011011111011101111 into an unsigned long long at compile-time and store it into bits. Full code and explanation of such using the template metaprogramming technique referred to above is provided in my blog entry titled, Using The C++ Literal Operator.
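For readers who cannot follow the link, here is a short sketch of the technique in the same spirit. It is not the code from the blog entry; parse_binary is a made-up helper that plays the role process_fp plays above. Each character of the literal arrives as a template argument and is folded into the result from left to right.

```cpp
// Primary template: only declared; the specializations do the work.
template <char... Digits>
struct parse_binary;

// No digits left: the accumulated value is the result.
template <>
struct parse_binary<> {
    static constexpr unsigned long long value(unsigned long long acc) {
        return acc;
    }
};

// Consume the leftmost digit, then recurse on the rest.
template <char D, char... Rest>
struct parse_binary<D, Rest...> {
    static_assert(D == '0' || D == '1', "only binary digits are allowed");
    static constexpr unsigned long long value(unsigned long long acc) {
        return parse_binary<Rest...>::value(acc * 2 + (D - '0'));
    }
};

// Template literal operator: takes no function arguments at all.
template <char... Digits>
constexpr unsigned long long operator""_binary() {
    return parse_binary<Digits...>::value(0);
}

static_assert(1011_binary == 11, "8 + 2 + 1");
static_assert(11111111_binary == 255, "eight set bits");
```

A literal containing any other character, such as 102_binary, fails to compile because of the static_assert in the recursive case.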
