Compile-time assert for string equality - c++

Is this possible to do using templates?
There are two string constants. They come from defines in different modules. They must be equal, or I shall raise compile-time error if they are not equal. Can I do this using templates?
#define MY_STRING "foo"
CompileAssertIfStringsNotEqual(MY_STRING, HIS_STRING);
P.S. I was deluded by assuming that "abc"[0] is constant expression. It is not. Weird omission in the language. It would be possibe had "abc"[0] been constand expression.

This is only possible with C++0x. No chance with C++03.
EDIT: Constexpr function for C++0x. The following works with GCC4.6, however the Standard is not explicit in allowing it, and a small wording tweak was and is being considered to make the spec allow it.
constexpr bool isequal(char const *one, char const *two) {
*one == *two && (!*one || isEqual(one + 1, two + 1));
}
static_assert(isequal("foo", "foo"), "this should never fail");
static_assert(!isequal("foo", "bar"), "this should never fail");
The compiler is required to track the reference to characters of the string literals already, throughout all the recursions. Just the final read from characters isn't explicitly allowed (if you squint, you can read it as being allowed, IMO). If your compiler doesn't want to accept the above simple version, you can make your macro declare arrays and then compare those
#define CONCAT1(A, B) A ## B
#define CONCAT(A, B) CONCAT1(A, B)
#define CHECK_EQUAL(A, B) \
constexpr char CONCAT(x1, __LINE__)[] = A, \
CONCAT(x2, __LINE__)[] = B; \
static_assert(isequal(CONCAT(x1, __LINE__), CONCAT(x2, __LINE__)), \
"'" A "' and '" B "' are not equal!")
That's definitely fine.
CHECK_EQUAL("foo", "foo"); /* will pass */
CHECK_EQUAL("foo", "bar"); /* will fail */
Note that CHECK_EQUAL can be used inside of functions. The FCD did make a change to allow constexpr functions to read from automatic arrays in their invocation substitution. See DR1197.

No. You can not. Not even with boost::mpl or the boost preprocessor library. Even if you were to stipulate that the compiler could coalesce all duplicate string references into the same pointer value at compile-time, the pointer value does not exist until link-time. What you want to implement happens at preprocessor time, and then is asserted at compile-time.

Related

Is it possible to convert a macro with stringizing operator to a constexpr?

I have written the following macro to mimic C#'s nameof operator but in C++/CLI:
#define nameof(x) (#x)
if (info == nullptr)
throw gcnew ArgumentNullException(nameof(info));
I have tried to convert this macro to a constexpr:
Either of these are wrong, they return the content of ToString:
template <typename T>
constexpr auto NAMEOF1(T value)
{
return """" + value + """";
}
template <typename T>
constexpr auto NAMEOF2(T value)
{
return System::String::Format("{0}", value);
}
Third attempt, using the typeid keyword:
template <typename T>
constexpr auto NAMEOF3(T value)
{
return gcnew System::String(typeid(value).name());
}
But it fails with the following error:
error C3185: 'typeid': used on managed type 'T', use 'T::typeid' instead
In short, not as easy as it sounds.
Question:
Is it possible to convert this nameof macro to a constexpr ?
(or should I just stick to good old #define ?)
No, it is not possible to obtain the name of an object in C++ in a constexpr function. constexpr functions are still able to be called at runtime, and at runtime, there is no possibility for this kind of reflection. Indeed, reflection in C++ in general is extremely limited. Without macros, your reflection abilities are limited to testing for the existence of and type of members using dirty and unintuitive template trickery, but nothing that returns a name as a string. Also be aware that as soon as an argument is passed to a function like NAMEOF1, the name of that argument will always be "value". There is no way to query the scope of the caller for the name or expression that was passed.
You should stick with your solution:
#define nameof(x) (#x)
This is of course, also a very limiting solution. This will simply turn the expression x into a string perfectly verbatim, and has no concept of different entities and scoping like nameof in C#.
I say this coming from a non-managed C++ background. From the error message you receive, it's clear that managed C++ has interesting effects on typeid. Perhaps somebody who knows more about C++/CLI can enlighten me or give a better answer.
Alternatively, if this sort of null-check is a frequent pattern, you could wrap that into a more concise macro. For example:
define THROW_IF_NULL(x) \
if ((x) == nullptr){ \
throw gcnew ArgumentNullException(#x); \
}
Example usage:
void process_info(Info* info){
THROW_IF_NULL(info)
// --snip--
}
On a side not, I'm not sure what you expect this to do:
return """" + value + """";
but if my understanding is correct, the expression """" will be parsed as two empty string literals ("", ""), which, because they appear side-by-side, are appended by the preprocessor into a single empty string literal. You can thus replace """" with "" to achieve the same effect. But perhaps you wanted something else?

c++ macro recognizing tokens as arguments

So, it's been a while since I have written anything in C++ and now I'm working on a project using C++11 and macros.
I know that by using the stringify operator I can do this:
#define TEXT(a) #a //expands to "a"
How am I supposed to use the preprocessor for recognizing the tokens like + and * to do this:
#define TEXT(a)+ ??? //want to expand to "a+"
#define TEXT(a)* ??? //want to expand to "a*"
when the input has to be in that syntax?
I have tried doing that:
#define + "+"
but of course it doesn't work. How can I make the preprocessor recognize those tokens?
NOTE:
This is actually part of a project for a small language that defines and uses regular expressions, where the resulting string of the macros is to be used in a regex. The syntax is given and we have to use it as it is without making any changes to it.
eg
TEXT(a)+ is to be used to make the regular expression: std::regex("a+")
without changing the fact that TEXT(a) expands to "a"
First,
#define TEXT(a) #a
doesn't “convert to "a"”. a is just a name for a parameter. The macro expands to a string that contains whatever TEXT was called with. So TEXT(42 + rand()) will expand to "42 + rand()". Note that, if you pass a macro as parameter, the macro will not be expanded. TEXT(EXIT_SUCCESS) will expand to "EXIT_SUCCESS", not "0". If you want full expansion, add an additional layer of indirection and pass the argument to TEXT to another macro TEXT_R that does the stringification.
#define TEXT_R(STUFF) # STUFF
#define TEXT(STUFF) TEXT_R(STUFF)
Second, I'm not quite sure what you mean with TEXT(a)+ and TEXT(a)*. Do you want, say, TEXT(foo) to expand to "foo+"? I think the simplest solution in this case would be to use the implicit string literal concatenation.
#define TEXT_PLUS(STUFF) # STUFF "+"
#define TEXT_STAR(STUFF) # STUFF "*"
Or, if you want full expansion.
#define TEXT_R(STUFF) # STUFF
#define TEXT_PLUS(STUFF) TEXT_R(STUFF+)
#define TEXT_STAR(STUFF) TEXT_R(STUFF*)
Your assignment is impossible to solve in C++. You either misunderstood something or there’s an error in the project specification. At any rate, we’ve got a problem here:
TEXT(a)+ is to be used to make the regular expression: std::regex("a+") without changing the fact that TEXT(a) expands to "a" [my emphasis]
TEXT(a) expands to "a" — meaning, we can just replace TEXT(a) everywhere in your example; after all, that’s exactly what the preprocessor does. In other words, you want the compiler to transform this C++ code
"a"+
into
std::regex("a+")
And that’s simply impossible, because the C++ preprocess does not allow expanding the + token.
The best we can do in C++ is use operator overloading to generate the desired code. However, there are two obstacles:
You can only overload operators on custom types, and "a" isn’t a custom type; its type is char const[2] (why 2? Null termination!).
Postfix-+ is not a valid C++ operator and cannot be overloaded.
If your assignment had just been a little different, it would work. In fact, if your assignment had said that TEXT(a)++ should produce the desired result, and that you are allowed to change the definition of TEXT to output something other than "a", then we’d be in business:
#include <string>
#include <regex>
#define TEXT(a) my_regex_token(#a)
struct my_regex_token {
std::string value;
my_regex_token(std::string value) : value{value} {}
// Implicit conversion to `std::regex` — to be handled with care.
operator std::regex() const {
return std::regex{value};
}
// Operators
my_regex_token operator ++(int) const {
return my_regex_token{value + "+"};
}
// more operators …
};
int main() {
std::regex x = TEXT(a)++;
}
You don't want to jab characters onto the end of macros.
Maybe you simply want something like this:
#define TEXT(a, b) #a #b
that way TEXT(a, +) gets expanded to "a" "+" and TEXT(a, *) to "a" "*"
If you need that exact syntax, then use a helper macro, like:
#define TEXT(a) #a
#define ADDTEXT(x, y) TEXT(x ## y)
that way, ADDTEXT(a, +) gets expanded to "a+" and ADDTEXT(a, *) gets expanded to "a*"
You can do it this way too:
#define TEXT(a) "+" // "a" "+" -> "a+"
#define TEXT(a) "*" // "a" "*" -> "a*"
Two string literals in C/C++ will be joined into single literal by specification.

Looking for a MACRO adding two characters together for switch cases

I'm working on an old network engine and the type of package sent over the network is made up of 2 bytes.
This is more or less human readable form, for example "LO" stands for Login.
In the part that reads the data there is an enormous switch, like this:
short sh=(((int)ad.cData[p])<<8)+((int)ad.cData[p+1]);
switch(sh)
{
case CMD('M','D'):
..some code here
break
where CMD is a define:
#define CMD(a,b) ((a<<8)+b)
I know there are better ways but just to clean up a bit and also to be able to search for the tag (say "LO") more easily (and not search for different types of "'L','O'" or "'L' , 'O'" or the occasional "'L', 'O'" <- spaces make it hard to search) I tried to make a MACRO for the switch so I could use "LO" instead of the define but I just can't get it to compile.
So here is the question: how do you change the #define to a macro that I can use like this instead:
case CMD("MD"):
..some code here
break
It started out as a little subtask to make life a little bit easier but now I can't get it out of my head, thanks for any help!
Cheers!
[edit] The code works, it the world that's wrong! ie. Visual Studio 2010 has a bug concerning this. No wonder I cut my teeth on it.
Macro-based solution
A string-literal is really an instance of char const[N] where N is the length of the string, including the terminating null-byte. With this in mind you can easily access any character within the string-literal by using string-literal[idx] to specify that you'd like to read the character stored at offset idx.
#define CMD(str) ((str[0]<<8)+str[1])
CMD("LO") => (("LO"[0]<<8)+"LO"[1]) => (('L'<<8)+'0')
You should however keep in mind that there's nothing preventing your from using the above macro with a string which is shorter than that of length 2, meaning that you can run into undefined-behavior if you try to read an offset which is not actually valid.
RECOMMENDED: C++11, use a constexpr function
You could create a function usable in constant-expressions (and with that, in case-labels), with a parameter of reference to const char[3], which is the "real" type of your string-literal "FO".
constexpr short cmd (char const(&ref)[3]) {
return (ref[0]<<8) + ref[1];
}
int main () {
short data = ...;
switch (data) {
case cmd("LO"):
...
}
}
C++11 and user-defined literals
In C++11 we were granted the possibility to define user-defined literals. This will make your code far easier to maintain and interpret, as well as having it be safer to use:
#include <stdexcept>
constexpr short operator"" _cmd (char const * s, unsigned long len) {
return len != 2 ? throw std::invalid_argument ("") : ((s[0]<<8)+s[1]);
}
int main () {
short data = ...;
switch (data) {
case "LO"_cmd:
...
}
}
The value associated with a case-label must be yield through a constant-expression. It might look like the above might throw an exception during runtime, but since a case-label is constant-expression the compiler must be able to evaluate "LO"_cmd during translation.
If this is not possible, as in "FOO"_cmd, the compiler will issue a diagnostic saying that the code is ill-formed.

For identical static_assert messages, should I rely on MACROS?

static_assert has the following syntax, which states that a string literal is required.
static_assert ( bool_constexpr , string literal );
Since an instance of a string CAN'T be observed at compile time, the following code is invalid:
const std::string ERROR_MESSAGE{"I assert that you CAN NOT do this."};
static_assert(/* boolean expression */ ,ERROR_MESSAGE);
I have static asserts all over my code, which say the same error message.
Since a string literal is required, would it be best to replace all the repetitive string literals with a MACRO, or is there a better way?
// Is this method ok?
// Should I hand type them all instead?
// Is there a better way?
#define _ERROR_MESSAGE_ "danger"
static_assert(/* boolean expression 1*/ ,_ERROR_MESSAGE_);
//... code ...
static_assert(/* boolean expression 2*/ ,_ERROR_MESSAGE_);
//... code ...
static_assert(/* boolean expression 3*/ ,_ERROR_MESSAGE_);
In C++ you should not define a constant as a macro. Define it as a constant. That's what constants are for.
Also, names beginning with underscore followed by uppercase letter, such as your _ERROR_MESSAGE_, are reserved to the implementation.
That said, yes, it's good idea to use a macro for static asserts, both to ensure a correct string argument and to support compilers that possibly don't have static_assert, but this macro is not C style constant: it takes the expression as argument, and provides that expression as the string message.
Here is my current <static_assert.h>:
#pragma once
// Copyright (c) 2013 Alf P. Steinbach
// The "..." arguments permit template instantiations with "<" and ">".
#define CPPX_STATIC_ASSERT__IMPL( message_literal, ... ) \
static_assert( __VA_ARGS__, "CPPX_STATIC_ASSERT: " message_literal )
#define CPPX_STATIC_ASSERT( ... ) \
CPPX_STATIC_ASSERT__IMPL( #__VA_ARGS__, __VA_ARGS__ )
// For arguments like std::integral_constant
#define CPPX_STATIC_ASSERT_YES( ... ) \
CPPX_STATIC_ASSERT__IMPL( #__VA_ARGS__, __VA_ARGS__::value )
As you can see there are some subtleties involved even when the compiler does have static_assert.

Encrypting / obfuscating a string literal at compile-time

I want to encrypt/encode a string at compile time so that the original string does not appear in the compiled executable.
I've seen several examples but they can't take a string literal as argument. See the following example:
template<char c> struct add_three {
enum { value = c+3 };
};
template <char... Chars> struct EncryptCharsA {
static const char value[sizeof...(Chars) + 1];
};
template<char... Chars>
char const EncryptCharsA<Chars...>::value[sizeof...(Chars) + 1] = {
add_three<Chars>::value...
};
int main() {
std::cout << EncryptCharsA<'A','B','C'>::value << std::endl;
// prints "DEF"
}
I don't want to provide each character separately like it does. My goal is to pass a string literal like follows:
EncryptString<"String to encrypt">::value
There's also some examples like this one:
#define CRYPT8(str) { CRYPT8_(str "\0\0\0\0\0\0\0\0") }
#define CRYPT8_(str) (str)[0] + 1, (str)[1] + 2, (str)[2] + 3, (str)[3] + 4, (str)[4] + 5, (str)[5] + 6, (str)[6] + 7, (str)[7] + 8, '\0'
// calling it
const char str[] = CRYPT8("ntdll");
But it limits the size of the string.
Is there any way to achieve what I want?
I think this question deserves an updated answer.
When I asked this question several years ago, I didn't consider the difference between obfuscation and encryption. Had I known this difference then, I'd have included the term Obfuscation in the title before.
C++11 and C++14 have features that make it possible to implement compile-time string obfuscation (and possibly encryption, although I haven't tried that yet) in an effective and reasonably simple way, and it's already been done.
ADVobfuscator is an obfuscation library created by Sebastien Andrivet that uses C++11/14 to generate compile-time obfuscated code without using any external tool, just C++ code. There's no need to create extra build steps, just include it and use it. I don't know a better compile-time string encryption/obfuscation implementation that doesn't use external tools or build steps. If you do, please share.
It not only obuscates strings, but it has other useful things like a compile-time FSM (Finite State Machine) that can randomly obfuscate function calls, and a compile-time pseudo-random number generator, but these are out of the scope of this answer.
Here's a simple string obfuscation example using ADVobfuscator:
#include "MetaString.h"
using namespace std;
using namespace andrivet::ADVobfuscator;
void Example()
{
/* Example 1 */
// here, the string is compiled in an obfuscated form, and
// it's only deobfuscated at runtime, at the very moment of its use
cout << OBFUSCATED("Now you see me") << endl;
/* Example 2 */
// here, we store the obfuscated string into an object to
// deobfuscate whenever we need to
auto narrator = DEF_OBFUSCATED("Tyler Durden");
// note: although the function is named `decrypt()`, it's still deobfuscation
cout << narrator.decrypt() << endl;
}
You can replace the macros DEF_OBFUSCATED and OBFUSCATED with your own macros. Eg.:
#define _OBF(s) OBFUSCATED(s)
...
cout << _OBF("klapaucius");
How does it work?
If you take a look at the definition of these two macros in MetaString.h, you will see:
#define DEF_OBFUSCATED(str) MetaString<andrivet::ADVobfuscator::MetaRandom<__COUNTER__, 3>::value, andrivet::ADVobfuscator::MetaRandomChar<__COUNTER__>::value, Make_Indexes<sizeof(str) - 1>::type>(str)
#define OBFUSCATED(str) (DEF_OBFUSCATED(str).decrypt())
Basically, there are three different variants of the MetaString class (the core of the string obfuscation). Each has its own obfuscation algorithm. One of these three variants is chosen randomly at compile-time, using the library's pseudo-random number generator (MetaRandom), along with a random char that is used by the chosen algorithm to xor the string characters.
"Hey, but if we do the math, 3 algorithms * 255 possible char keys (0 is not used) = 765 variants of the obfuscated string"
You're right. The same string can only be obfuscated in 765 different ways. If you have a reason to need something safer (you're paranoid / your application demands increased security) you can extend the library and implement your own algorithms, using stronger obfuscation or even encryption (White-Box cryptography is in the lib's roadmap).
Where / how does it store the obfuscated strings?
One thing I find interesting about this implementation is that it doesn't store the obfuscated string in the data section of the executable.
Instead, it is statically stored into the MetaString object itself (on the stack) and the algorithm decodes it in place at runtime. This approach makes it much harder to find the obfuscated strings, statically or at runtime.
You can dive deeper into the implementation by yourself. That's a very good basic obfuscation solution and can be a starting point to a more complex one.
Save yourself a heap of trouble down the line with template metaprogramming and just write a stand alone program that encrypts the string and produces a cpp source file which is then compiled in. This program would run before you compile and would produce a cpp and/or header file that would contain the encrypted string for you to use.
So here is what you start with:
encrypted_string.cpp and encrypted_string.h (which are blank)
A script or standalone app that takes a text file as an input and over writes encrypted_string.cpp and encrypted_string.h
If the script fails, your compiling will fail because there will be references in your code to a variable that does not exist. You could get smarter, but that's enough to get you started.
The reason why the examples you found can't take string literals as template argument is because it's not allowed by the ISO C++ standard. That's because, even though c++ has a string class, a string literal is still a const char *. So, you can't, or shouldn't, alter it (leads to undefined behaviour), even if you can access the characters of such an compile-time string literal.
The only way I see is using defines, as they are handled by the preprocessor before the compiler. Maybe boost will give you a helping hand in that case.
A macro based solution would be to take a variadic argument and pass in each part of the string as a single token. Then stringify the token and encrypt it and concatenate all tokens. The end result would look something like this
CRYPT(m y _ s t r i n g)
Where _ is some placeholder for a whitespace character literal. Horribly messy and I would prefer every other solution over this.
Something like this could do it although the Boost.PP Sequence isn't making it any prettier.
#include <iostream>
#include <boost/preprocessor/stringize.hpp>
#include <boost/preprocessor/seq/for_each.hpp>
#define GARBLE(x) GARBLE_ ## x
#define GARBLE_a x
#define GARBLE_b y
#define GARBLE_c z
#define SEQ (a)(b)(c)
#define MACRO(r, data, elem) BOOST_PP_STRINGIZE(GARBLE(elem))
int main() {
const char* foo = BOOST_PP_SEQ_FOR_EACH(MACRO, _, SEQ);
std::cout << foo << std::endl;
}