Reverse preprocessor stringizing operator - c++

An SDK header that I cannot modify defines many numeric constants as wide string literals. The header is updated often, so I cannot maintain a parallel set of numeric defines by hand: the values change every few days, and I don't want to (and am not allowed to) use a script to update them.
If it were the other way round and the constants were defined as numbers, I could simply produce the strings with the # preprocessor operator.
I don't want to use atoi and I don't want to create any variables; I just need the constants in numeric form, ideally via the preprocessor.
I know there is no reverse stringizing operator, but is there any way to convert a string to a token (number) in the preprocessor?

There is no way to "unstringify" a string in the preprocessor. However, you can get, at least, constant expressions out of the string literals using user-defined literals. Below is an example initializing an enum value with the value taken from a string literal to demonstrate that the decoding happens at compile time, although not during preprocessing:
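#include <cstddef>
#include <iostream>
// Recursively fold the decimal digits of val into base (C++11 constexpr).
constexpr int make_value(int base, wchar_t const* val, std::size_t n)
{
    return n ? make_value(base * 10 + val[0] - L'0', val + 1, n - 1) : base;
}
// Literal operator: L"123"_decode yields the integer 123.
constexpr int operator"" _decode(wchar_t const* val, std::size_t n)
{
    return make_value(0, val, n);
}
#define VALUE L"123"
#define CONCAT(v,s) v ## s
// The extra macro level expands VALUE before ## pastes the suffix on.
#define DECODE(d) CONCAT(d,_decode)
int main()
{
    enum { value = DECODE(VALUE) }; // an enumerator requires a constant expression
    std::cout << "value=" << value << "\n";
}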
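As a variation on the same idea, here is a minimal sketch assuming a C++14 compiler, where the digits can be decoded in a plain loop and a non-digit character becomes a compile error. The names SDK_BUILD, TO_NUMBER and the _num suffix are made up for illustration; SDK_BUILD stands in for whatever the SDK header actually defines.

```cpp
#include <cstddef>

// Hypothetical literal suffix: decodes a wide string of decimal digits.
constexpr int operator""_num(const wchar_t* s, std::size_t n)
{
    int value = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (s[i] < L'0' || s[i] > L'9')
            throw "non-digit character in numeric string"; // rejected when evaluated at compile time
        value = value * 10 + (s[i] - L'0');
    }
    return value;
}

#define SDK_BUILD L"4711"          // assumption: stands in for the SDK's define
#define CONCAT_(a, b) a ## b
#define TO_NUMBER(lit) CONCAT_(lit, _num) // extra level so SDK_BUILD expands first

// Both uses require a constant expression, so decoding happens at compile time.
static_assert(TO_NUMBER(SDK_BUILD) == 4711, "decoded at compile time");
enum { build = TO_NUMBER(SDK_BUILD) };
```

Since the result is a constant expression, it can be used for array bounds, enumerators, case labels and template arguments, which is probably as close to "a number from the preprocessor" as you can get.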

Related

Stringize operator on argument with spaces

I have
#define ARG(TEXT , REPLACEMENT) replace(#TEXT, REPLACEMENT)
so
QString str= QString("%ONE, %TWO").ARG(%ONE, "1").ARG(%TWO, "2");
becomes
str= QString("%ONE, %TWO").replace("%ONE", "1").replace("%TWO", "2");
//str = "1, 2"
The problem is that VS2019, when formatting the code (Edit.FormatSelection) interprets that % sign as an operator and adds a whitespace
QString str= QString("%ONE, %TWO").ARG(% ONE, "1").ARG(% TWO, "2");
(I think it's a bug in VS). The code compiles without warnings.
As I am dealing with some ancient code that has this "feature" all over the place, I'm worried that auto-formatting text containing it will break functionality.
Is there a way at compile time to detect such arguments to a macro having space(s)?
Here's what I would do:
#define ARG(TEXT, REPLACEMENT) \
replace([]{ \
static constexpr char x[] = #TEXT; \
static_assert(x[0] == '%' && x[1] != ' '); \
return x; \
}(), REPLACEMENT)
Apparently some time in the next decade C++ will provide a better solution, and indeed there might be a much less clunky solution than the one I provide below, but it's maybe a place to start.
This version uses the Boost Preprocessor library to do a repetition that would be straightforward to write with a template if C++ allowed string literals as template arguments, a feature which has not yet made it into the standard, for reasons I can only guess at. As a result it doesn't test that the whole argument is free of spaces; it tests that there are no spaces in the first 64 characters (64 is an almost entirely arbitrary number which you can change as your needs dictate). You could replace Boost.Preprocessor with your own special-purpose macros if for some reason you don't want to use Boost.
#include <boost/preprocessor/repetition/repeat.hpp>
#define NO_SPACE_AT_N(z, N, s) && (N >= sizeof(s) || s[N] != ' ')
#define NO_SPACE(s) true BOOST_PP_REPEAT(64, NO_SPACE_AT_N, s)
// Arbitrary constant, change as needed---^
// Produce a compile time error if there's a space.
template<bool ok> struct NoSpace {
const char* operator()(const char* s) {
static_assert(ok, "Unexpected space");
return s;
}
};
#define ARG(TEXT, REPL) replace(NoSpace<NO_SPACE(#TEXT)>()(#TEXT), REPL)
(Test on gcc.godbolt.)
If the question is to produce a compilation error when the first argument of ARG contains a space, I managed to get this to work:
#include <cstddef>
template<std::size_t N>
constexpr int string_validate( const char (&s)[N] )
{
for (std::size_t i = 0; i < N; ++i)
if ( s[i] == ' ' )
return 0;
return 1;
}
template<int N> void assert_const() { static_assert(N, "string validation failed"); }
int replace(char const *, char const *) { return 0; } // dummy for example
#define ARG(TEXT , REPLACEMENT) replace((assert_const<string_validate(#TEXT)>(), #TEXT), REPLACEMENT)
int main()
{
auto b = ARG(%TWO, "2");
auto a = ARG(% ONE, "1"); // causes assertion failure
}
Undoubtedly there is a shorter way. Prior to C++20 you can't use a string literal in a template parameter, hence the constexpr function to produce an integer from the string literal and then we can check the integer at compile-time by using it as a template parameter.
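For what it's worth, the lambda trick and the constexpr loop from the answers above can be combined so that the whole argument is checked without Boost and without a helper template. This is only a sketch, assuming C++14 or later; has_space and CHECKED_STR are hypothetical names, and in the real code the macro body would be wrapped in replace(...) as in the original ARG.

```cpp
#include <cstddef>
#include <string>

// Returns true if the string literal contains a space anywhere.
template <std::size_t N>
constexpr bool has_space(const char (&s)[N])
{
    for (std::size_t i = 0; i < N; ++i)
        if (s[i] == ' ')
            return true;
    return false;
}

// Hypothetical macro: stringizes TEXT and refuses to compile if the
// resulting string contains a space (e.g. after a formatter split "% ONE").
#define CHECKED_STR(TEXT)                                                    \
    ([]() -> const char* {                                                   \
        static_assert(!has_space(#TEXT), "macro argument contains a space"); \
        return #TEXT;                                                        \
    }())

// Compiles: the argument stringizes to "%ONE".
inline std::string demo() { return CHECKED_STR(%ONE); }
// CHECKED_STR(% ONE) would stringize to "% ONE" and trip the static_assert.
```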
It's unlikely.
Visual Studio works on source code, without running the preprocessor first and without performing what would be quite a difficult computation to work out whether the preprocessor would fundamentally alter the line it's formatting.
Besides, people don't really use macros in this way any more, or shouldn't (we have cheap functions!).
So this isn't really what the formatting feature expects.
If you can modify the code, make the user write .ARG("%ONE", "1"); then not only does the problem go away, but the code would also be more consistent.
Otherwise, you'll have to stick with formatting the code by hand.

User defined literals definitions

I was taking a look at the cppreference page for user defined literals, and I think I understand everything except a few examples
template <char...> double operator "" _π(); // OK
How does this operator work? How can you call it?
double operator"" _Z(long double); // error: all names that begin with underscore
// followed by uppercase letter are reserved
double operator""_Z(long double); // OK: even though _Z is reserved ""_Z is allowed
What is the difference between the above two functions? What would be the difference in calling the first function as opposed to the second if the first were not an error?
Thanks!
template <char...> double operator "" _π(); // OK
How does this operator work? How can you call it?
1.234_π will call operator "" _π<'1', '.', '2', '3', '4'>(). This form allows you to detect differences in spelling that would ordinarily be undetectable (1.2 vs 1.20, for example), and allows you to avoid rounding issues due to 1.2 not being exactly representable in even long double.
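To make that concrete, here is a small sketch with a made-up suffix _len that just reports how many characters the literal was spelled with. It shows that 1.2 and 1.20 reach the operator as different template argument lists even though they denote the same value.

```cpp
#include <cstddef>

// Hypothetical suffix: the literal's spelling arrives as a pack of chars.
template <char... Cs>
constexpr std::size_t operator""_len()
{
    return sizeof...(Cs);
}

// 1.2_len  calls operator""_len<'1', '.', '2'>()
// 1.20_len calls operator""_len<'1', '.', '2', '0'>()
static_assert(1.2_len == 3, "three characters in the spelling 1.2");
static_assert(1.20_len == 4, "four characters in the spelling 1.20");
```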
double operator"" _Z(long double); // error: all names that begin with underscore
// followed by uppercase letter are reserved
double operator""_Z(long double); // OK: even though _Z is reserved ""_Z is allowed
What is the difference between the above two functions?
The C++ standard defines the grammar in terms of tokens, which you can sort of interpret as words. "" _Z is two tokens, "" and _Z. ""_Z is a single token.
This distinction matters: given #define S " world!", and then "Hello" S, the whitespace is what makes S a standalone token, preventing it from being seen as a user-defined literal suffix.
For easier coding, both "" _Z and ""_Z syntaxes are generally allowed when defining these functions, but the "" _Z syntax requires _Z to be seen as an identifier. This can cause problems when an implementation predefines _Z as a macro, or declares it as a custom keyword.
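A short sketch of the token distinction, assuming C++14 for the standard s suffix. The macro S is the one from the explanation above; the variable names are made up.

```cpp
#include <string>

#define S " world!"

// Two tokens: "Hello" and S. S is expanded as a macro, and the two
// adjacent string literals are then concatenated.
const char* greeting = "Hello" S;

// One token: "Hello"s is a string literal with the suffix s, so no macro
// lookup happens on the suffix. Here it is the standard std::string literal.
using namespace std::string_literals;
const std::string hello = "Hello"s;
```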
As far as I understand, there is no difference between the two signatures.
The issue is that the identifier _Z is technically reserved by the standard. The main difference is that there is a space:
double operator""/*space*/_Z(long double);
double operator""_Z(long double);
Removing the space is basically a workaround that in theory would suppress the error (or more likely a warning).
As far as how you use them, did you look at the examples from the link you listed?
#include <iostream>
// used as conversion
constexpr long double operator"" _deg ( long double deg )
{
return deg*3.141592/180;
}
// used with custom type
struct mytype
{
mytype ( unsigned long long m):m(m){}
unsigned long long m;
};
mytype operator"" _mytype ( unsigned long long n )
{
return mytype(n);
}
// used for side-effects
void operator"" _print ( const char* str )
{
std::cout << str;
}
int main(){
double x = 90.0_deg;
std::cout << std::fixed << x << '\n';
mytype y = 123_mytype;
std::cout << y.m << '\n';
0x123ABC_print;
}
The idea behind user-defined literals is to let you define an operator that is applied to a built-in literal and converts it to another type.
EDIT:
To call one of these operators you just need to append the operator as a suffix to a value literal. So given:
// used as conversion
constexpr long double operator"" _deg ( long double deg )
{
return deg*3.141592/180;
}
The calling code could be for example:
long double d = 45.0_deg; // a floating-point literal is needed to match the long double parameter
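One caveat worth spelling out: a literal operator taking long double is only considered for floating-point literals such as 45.0_deg. An integer literal such as 45_deg needs an overload taking unsigned long long. A minimal sketch, reusing the _deg suffix from the example above (the second overload is my addition, not part of the cppreference example):

```cpp
// Matches floating-point literals, e.g. 45.0_deg.
constexpr long double operator""_deg(long double deg)
{
    return deg * 3.141592 / 180;
}

// Matches integer literals, e.g. 45_deg (added here for illustration).
constexpr long double operator""_deg(unsigned long long deg)
{
    return static_cast<long double>(deg) * 3.141592 / 180;
}

static_assert(45_deg == 45.0_deg, "both spellings give the same value");
```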
As far as using template <char...> double operator "" _π(); Maybe take a look at this.

Preprocessor: convert macro parameter to multicharacter literal

In a nutshell:
It's possible to convert a macro parameter to a character string literal (one that contains the spelling of the preprocessing token sequence for the corresponding argument, according to 16.3.2, n3797) with the # operator. Is there a way, some trick maybe, to convert a macro parameter to a multibyte character?
So, for example, I would have some macro TO_CHAR(WHATEWER), and TO_CHAR(CONSTANT) would give me the 'CONSTANT' multibyte character?
I need this to work with boost::mpl::string. More precisely, I think so:
I have a lot of constants and I work with them with boost::mpl::vector (boost::mpl::vector_c). Now, I need to print out the name of constant after some work with this constant. Pseudo-pseudo code:
struct Temp
{
template<typename Value>
operator()(Value)
{
// ...
std::cout << "The value of " << #Value::value Name << " is " << Value::value;
};
};
typedef boost::mpl::vector_c<int, MY_CONST1, MY_CONST2> Constants;
boost::mpl::for_each<Constants>(Temp());
The output must be something like this:
The value of MY_CONST1 is 1
The value of MY_CONST2 is 2
I don't want to make some runtime array with a lot of typing:
typedef boost::mpl::vector_c<int, MY_CONST1, MY_CONST2> Constants;
const char* constant_names[] = { "MY_CONST1", "MY_CONST2" };
// Work with this stuff
What I want to have is something like this:
#define MAP_CONST(CONST) boost::mpl::pair<boost::mpl::int_<CONST>, (boost::mpl::string maybe here) #CONST>
typedef boost::mpl::vector<MAP_CONST(MY_CONST1), MAP_CONST(MY_CONST2)> Constants;
// Work with #Constants
Btw, I can't use constexpr or the new standard in general: I have a C++03 compiler.
Thanks
UPD: Sorry for the mix-up: I don't mean a multibyte character, but a "multicharacter literal" (according to the standard, 2.14.3, n3797). Something like this

C++ integer template parameter evaluation

This question may be very straightforward but I am rather inexperienced with c++ and got stuck while writing a simple parser.
For some reason one of the string comparison functions would not return the expected value when called.
The function looks like this:
template<int length>
bool Parser::compare(const char *begin, const char *str){
int i = 0;
while(i != length && compareCaseInsensitive(*begin, *str)){
i++;
begin++;
str++;
}
return i == length;
};
The purpose of this function is to compare a runtime character buffer with a compile-time constant string, e.g.:
compare<4>(currentByte, "<!--");
I know there are more efficient ways to compare a fixed length character buffer (and used one later on) but I was rather puzzled when I ran this function and it always returns false, even with two identical strings.
I checked with the debugger and checked the value of i at the end of the loop and it was equal to the value of the template parameter but still the return expression evaluated to false.
Are there any special rules about working with int template parameters ?
I assumed the template parameter would behave like a compile time constant.
I don't know if this is relevant but I'm running gcc's g++ compiler and debugged with gdb.
If anyone could tell me what might cause this problem it would be highly appreciated.
The functions used in this piece of code:
template<typename Character>
Character toLowerCase(Character c){
return c > 64 && c < 91 ? c | 0x20 : c; // ASCII 'A'..'Z': setting bit 0x20 gives the lowercase letter
};
template<typename Character>
bool equalsCaseInsensitive(Character a, Character b){
return toLowerCase(a) == toLowerCase(b);
};
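For locale-aware string comparisons you could look at the C library function std::strcoll from the header <cstring> (note that it orders strings by the locale's collation rules and does not by itself ignore case; in the default "C" locale it behaves like std::strcmp). It has signature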
int strcoll( const char* lhs, const char* rhs );
and compares two null-terminated byte strings according to the current locale. Or if you want to roll your own, you could still use std::tolower from the header <cctype> which has signature
int tolower( int ch );
and converts the given character to lowercase according to the character conversion rules defined by the currently installed C locale.
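A minimal sketch of the roll-your-own route, assuming the pattern is always a string literal so that its length can be deduced as a template parameter (which does behave as a compile-time constant). The name matches_ignore_case is made up for illustration.

```cpp
#include <cctype>
#include <cstddef>

// Compares the first N-1 characters of begin with the literal, ignoring case.
// N is deduced from the literal and includes its terminating '\0'.
template <std::size_t N>
bool matches_ignore_case(const char* begin, const char (&literal)[N])
{
    for (std::size_t i = 0; i + 1 < N; ++i) {
        // std::tolower must not be given a negative value, hence unsigned char.
        const int a = std::tolower(static_cast<unsigned char>(begin[i]));
        const int b = std::tolower(static_cast<unsigned char>(literal[i]));
        if (a != b)
            return false; // also stops at an early '\0' in begin
    }
    return true;
}
```

Usage would look like matches_ignore_case(currentByte, "<!--"), with no need to pass the length by hand.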

compile-time string hashing

I need to use a string as the ID to obtain some object. I implemented this at run time, and it works well. But this makes static type checking impossible, for obvious reasons.
I Googled for an algorithm that calculates the hash of a string at compile time: C++ compile-time string hashing with Boost.MPL.
It seems to be the perfect solution to my problem, except that the string passed to the algorithm has to be split into pieces of 4 characters, or character by character.
I.e., instead of writing the IDs the usual way, I would have to write them like this:
hash_cstring<boost::mpl::string<'obje', 'ct.m', 'etho', 'd'>>::value
This is absolutely unusable.
The question is: how can I pass a string such as "object.method" to this algorithm directly?
Thank you all.
Solution with gcc-4.6:
#include <iostream>
template<size_t N, size_t I=0>
struct hash_calc {
static constexpr size_t apply (const char (&s)[N]) {
return (hash_calc<N, I+1>::apply(s) ^ s[I]) * 16777619u;
};
};
template<size_t N>
struct hash_calc<N,N> {
static constexpr size_t apply (const char (&s)[N]) {
return 2166136261u;
};
};
template<size_t N>
constexpr size_t hash ( const char (&s)[N] ) {
return hash_calc<N>::apply(s);
}
int main() {
char a[] = "12345678";
std::cout << std::hex << hash(a) << std::endl;
std::cout << std::hex << hash("12345678") << std::endl;
}
http://liveworkspace.org/code/DPObf
I'm happy!
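For readers on a newer compiler, here is a sketch of how such a hash can actually be consumed as a compile-time constant, assuming C++14. It uses the textbook FNV-1a loop over the characters without the terminator, so the values will differ from those of the recursive code above; fnv1a, dispatch and the two IDs are names made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a over a string literal, excluding the terminating '\0'.
template <std::size_t N>
constexpr std::uint32_t fnv1a(const char (&s)[N])
{
    std::uint32_t h = 2166136261u;             // FNV offset basis
    for (std::size_t i = 0; i + 1 < N; ++i) {
        h ^= static_cast<unsigned char>(s[i]);
        h *= 16777619u;                         // FNV prime
    }
    return h;
}

// Evaluated by the compiler; 0xe40c292c is the FNV-1a value for "a".
static_assert(fnv1a("a") == 0xe40c292cu, "hash is a constant expression");

// Case labels must be constant expressions, so the IDs are hashed at compile time.
inline int dispatch(std::uint32_t id)
{
    switch (id) {
    case fnv1a("object.get"): return 1;
    case fnv1a("object.set"): return 2;
    default:                  return 0;
    }
}
```

A nice side effect of the switch form is that two IDs with colliding hashes would show up as a duplicate case label at compile time.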
I don't know of a way to do this with the preprocessor or with templates. I suspect your best bet is to create a separate pre-compile step (say with perl or such) to generate the hash_cstring statements from a set of source statements. Then at least you don't have to split the strings manually when you add new ones, and the generation is fully automated and repeatable.
Templates can be instantiated with any external symbol, therefore this should work as expected:
extern char const object_method[] = "object.method"; // an array with external linkage; a pointer variable would not be a valid template argument
... = hash_cstring<object_method>::value;
(given the template hash_cstring<> is able to deal with pointer values).
In case anyone is interested, I walk through how to create a compile time hash of Murmur3_32 using C++11 constexpr functions and variadic templates here:
http://roartindon.blogspot.sg/2014/10/compile-time-murmur-hash-in-c.html
Most of the examples I've seen deal with hashes that are based on consuming one character of the string at a time. The Murmur3_32 hash is a bit more interesting in that it consumes 4 characters at a time and needs some special case code to handle the remaining 0, 1, 2 or 3 bytes.