C++ compile-time string manipulation

I looked at boost's mpl::string, but there doesn't seem to be an easy way of converting string literals to the single-quotation-integer-based format of mpl::string. What I am trying to do is to generate at compile time an XML realization of some simple data structures using compile time strings. I am striving for having macros generate the structures themselves and insert a constant "meta" field inside them, containing said XML string.

The short answer is no, there is no easy way, at least not at compile time using C++ alone. You can use a script or some other code generator to produce mpl::strings with the correct literals. C++0x (since published as C++11) brings user-defined literals [1], which allow easy manipulation of literals, character by character, for example using variadic templates.
[1] http://en.wikipedia.org/wiki/C%2B%2B0x#User-defined_literals
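As a rough illustration of what such literals make possible, here is a minimal sketch that assumes a C++11 compiler. The suffix _tags and the helper countOpenAngles are made-up names. The sketch uses the (const char*, std::size_t) form of the literal operator together with constexpr recursion rather than variadic templates. It only counts '<' characters in a literal at compile time; it does not build an mpl::string.

```cpp
#include <cassert>
#include <cstddef>

// Count '<' characters in [s, s + n). A single return statement with
// recursion keeps this a valid C++11 constexpr function.
constexpr std::size_t countOpenAngles(const char* s, std::size_t n) {
    return n == 0 ? 0 : (*s == '<' ? 1 : 0) + countOpenAngles(s + 1, n - 1);
}

// Hypothetical literal suffix: "..."_tags yields the number of '<' characters.
constexpr std::size_t operator""_tags(const char* s, std::size_t n) {
    return countOpenAngles(s, n);
}

// The compiler has to evaluate the literal operator to check this.
static_assert("<a><b/></a>"_tags == 3, "three tags in the literal");
```

Because the result is used in a static_assert, no run-time work on the string remains.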

Here is an article on the subject: http://akrzemi1.wordpress.com/2011/05/11/parsing-strings-at-compile-time-part-i/. The author implements a simple RPN arithmetic calculator that runs at compile time using user-defined string literals and constexpr. I won't attempt to summarize the article any further here.

Related

How can I replicate compile-time hex string interpretation at run time?

In my code the following line gives me data that performs the task it's meant for:
const char *key = "\xf1`\xf8\a\\\x9cT\x82z\x18\x5\xb9\xbc\x80\xca\x15";
The problem is that it gets converted at compile time according to rules that I don't fully understand. How does "\x" work in a String?
What I'd like to do is to get the same result but from a string exactly like that fed in at run time. I have tried a lot of things and looked for answers but none that match closely enough for me to be able to apply.
I understand that \x denotes a hex number. But I don't know in which form that gets 'baked out' by the compiler (gcc).
What does that ` translate into?
Does the "\a" do something similar to "\x"?
This is indeed done by the compiler, but that part is not exposed by the standard library. That leaves you with three options:
1. Dynamically write a C++ source file that contains the string and writes it to standard output. Compile it, execute it from your main program (provided popen is available), and read its output. Pretty ugly, isn't it...
2. Use the source of an existing compiler, or its internal libraries directly. Clang is probably a good starting point because it was designed to be modular. But it could take a good amount of work to find where that damned specific point is coded and how to use it...
3. Just mimic what the compiler does and write your own parser by hand. It is not that hard, and it will teach you why tests are useful...
If it was not clear by now, I strongly urge you to go the third way ;-)
If you want to translate escape codes in strings that you receive as input at run time, you need to do it yourself, explicitly.
One way is to read the input into one string, then copy the characters from that source string into a new destination string, one by one. If you see a backslash, discard it and fetch the next character. If it is an x, convert the hex digits that follow into the corresponding integer value (for example with std::stoi and base 16) and append that value to the destination string as a single char. Do not append it with std::to_string or with operator<< on an int, because those append the decimal digits of the number rather than the byte itself. Other escapes such as \a or \\ each map to one fixed character, and a character with no backslash in front of it, such as the `, is copied unchanged.
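The copy loop described above can be sketched as follows. This is a minimal sketch, not a full reimplementation of the compiler's rules: the function name unescape and the helper hexValue are made up, only a handful of escapes are handled, and octal or Unicode escapes are rejected. Like the compiler, it lets \x consume every hex digit that follows.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: numeric value of one hex digit (caller checks isxdigit).
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return c - 'A' + 10;
}

// Decode a small subset of C++ escape sequences in text received at run time.
std::string unescape(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '\\') { out += in[i]; continue; }  // ordinary character
        if (++i == in.size()) throw std::invalid_argument("trailing backslash");
        switch (in[i]) {
            case 'a':  out += '\a'; break;
            case 'n':  out += '\n'; break;
            case 't':  out += '\t'; break;
            case '\\': out += '\\'; break;
            case '"':  out += '"';  break;
            case 'x': {
                unsigned value = 0;
                std::size_t digits = 0;
                // Consume all following hex digits, as a compiler would.
                while (i + 1 < in.size() &&
                       std::isxdigit(static_cast<unsigned char>(in[i + 1]))) {
                    value = value * 16 + hexValue(in[++i]);
                    ++digits;
                    if (value > 0xFF) throw std::out_of_range("\\x value exceeds one byte");
                }
                if (digits == 0) throw std::invalid_argument("\\x without hex digits");
                out += static_cast<char>(value);  // append the byte, not its text
                break;
            }
            default: throw std::invalid_argument("unsupported escape sequence");
        }
    }
    return out;
}
```

Fed the text of the key from the question, this should produce the same 16 bytes that the compiler stores for the literal.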

How to test if a string is a valid C++(ish) expression?

I am writing a program in C++ that needs to be able to test if a string (probably std::string) is a valid C++ expression. Variables can be checked if they have been declared (bool variableDeclared(std::string identifier)) and their type can also be checked (std::string variableType(std::string identifier)). The variableType function returns a string based on how it would be declared in C++ ("bool", "double", "char", etc).
The expression doesn't need to be evaluated but only tested to see if it is valid. The function only needs to support character literals, string literals, number literals, brackets, simple operators (+, -, *, /, ! (logic not), &&, ||, >, <, ==), and variables of type double, std::string (no function calls needed), bool and char. It is also not required to support string concatenation.
The desired result would be a function that is something like bool validExpression(std::string expression). It is also preferable that it allows me to modify the operations (for example I could change "==" to "equal-to").
How would I implement this? Is there a library that could do something like this, a regex statement or is it simply a matter of a long function with lots of if statements?
Formally, your situation is: you have a grammar which describes the language of expressions which you want to validate, and a word for which you want to determine whether it belongs in that language. This is a job for a parser of that language.
You could hand-cook something like a recursive-descent LL(1) parser, or use a tool to generate a parser. A well-known example of such a tool is Bison for generating LALR(1) parsers. Wikipedia has a long parser generator list.
Technical terms are used above mainly to provide entry points for googling.
You would start from defining your language more or less formally. (A language is a set of strings). A good way to define a language is to specify its context-free grammar. Describe additional conditions (like the requirement that variables must be declared, and of the right type) informally in prose.
The next step would be building a parser for your grammar specified at the previous step. There are several tools for building parsers from grammars automatically, from yacc/bison to boost::spirit.
After building and checking the parser, implement the informally-specified rules and plug them into your parser code/data.
Normally the next step, building an evaluator, would probably be the easiest part of writing a simple interpreter, but you say you don't need one.
Describing your language as "just like C++ only with certain bits taken out" could be a preliminary step to the sequence outlined above. It is however not recommended to start out from C++ if you can help it. C++ is an extremely hard language to specify formally, and its parsers tend to be rather hairy, due to its convoluted declaration syntax.
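To make the hand-written option concrete, here is a minimal sketch of a recursive-descent validator for roughly the subset listed in the question. The class name ExprValidator is made up; validExpression is the name the question asks for. It checks syntax only: it does not call variableDeclared or variableType, it ignores operator precedence (which does not affect validity), and its tokenization is simplified, so for example "--x" is accepted as two negations.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Syntax-only check: declarations and types are NOT verified here.
class ExprValidator {
public:
    explicit ExprValidator(const std::string& text) : s_(text), pos_(0) {}
    bool valid() {
        if (!expr()) return false;
        skipSpace();
        return pos_ == s_.size();  // nothing may follow the expression
    }
private:
    bool isDigit() const {
        return pos_ < s_.size() && std::isdigit(static_cast<unsigned char>(s_[pos_]));
    }
    bool isIdentChar() const {
        return pos_ < s_.size() &&
               (std::isalnum(static_cast<unsigned char>(s_[pos_])) || s_[pos_] == '_');
    }
    void skipSpace() {
        while (pos_ < s_.size() && std::isspace(static_cast<unsigned char>(s_[pos_]))) ++pos_;
    }
    bool eat(const std::string& token) {
        skipSpace();
        if (s_.compare(pos_, token.size(), token) != 0) return false;
        pos_ += token.size();
        return true;
    }
    // Edit this table to rename or add binary operators.
    bool binaryOp() {
        static const char* const ops[] = {"&&", "||", "==", "+", "-", "*", "/", ">", "<"};
        for (const char* op : ops) if (eat(op)) return true;
        return false;
    }
    // expr := unary (binaryOp unary)*
    bool expr() {
        if (!unary()) return false;
        while (binaryOp()) if (!unary()) return false;
        return true;
    }
    // unary := ('!' | '-')* primary
    bool unary() {
        while (eat("!") || eat("-")) {}
        return primary();
    }
    // primary := '(' expr ')' | quoted literal | number | identifier
    bool primary() {
        skipSpace();
        if (pos_ >= s_.size()) return false;
        const char c = s_[pos_];
        if (c == '(') { ++pos_; return expr() && eat(")"); }
        if (c == '"' || c == '\'') return quoted(c);
        if (isDigit()) {
            while (isDigit()) ++pos_;
            if (pos_ < s_.size() && s_[pos_] == '.') { ++pos_; while (isDigit()) ++pos_; }
            return true;
        }
        if (isIdentChar()) { while (isIdentChar()) ++pos_; return true; }
        return false;
    }
    bool quoted(char quote) {
        ++pos_;  // opening quote
        while (pos_ < s_.size() && s_[pos_] != quote) {
            if (s_[pos_] == '\\') ++pos_;  // skip the escaped character
            ++pos_;
        }
        if (pos_ >= s_.size()) return false;  // unterminated literal
        ++pos_;  // closing quote
        return true;
    }
    std::string s_;
    std::size_t pos_;
};

bool validExpression(const std::string& expression) {
    return ExprValidator(expression).valid();
}
```

The declaration and type rules would be plugged in where primary() accepts an identifier.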
You can run the compiler as a sub-process of your application. All you have to do is pass the arguments and parse the response properly.

Choose default type and encoding for C++ string literals at compile time

C++11 introduced the new string literals for UTF-8, 16 and 32 with the u8, u and U prefixes but I have to hard code which one I want to use. I'm looking for a way to select which encoding I want to use at compile time (similar to how a typedef works).
User defined string literals don't seem to help as they work on the strings of the specified encoding.
I have seen in pre C++11 code the use of a short macro such as L("string") to choose between "string" and L"string" but personally I find that quite ugly.
Is it possible to neatly choose the default type and encoding or will I have to use the macro option?
Unfortunately, the solution to this problem is to use macros. Although @Nadim Farhat pointed out that you can do a certain amount of choosing with gcc, that is by no means a portable solution.
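For reference, the macro approach can be kept fairly small. The following is a minimal sketch under assumptions: APP_USE_UTF16, APP_TEXT, app_char and app_string are made-up names, and only the choice between plain char literals and UTF-16 literals is shown. The switch would be defined on the compiler command line.

```cpp
#include <cassert>
#include <string>

// Hypothetical configuration switch: define APP_USE_UTF16 to get u"..." literals.
#if defined(APP_USE_UTF16)
    typedef char16_t app_char;
    #define APP_TEXT_IMPL(s) u##s   // pastes the u prefix onto the literal
#else
    typedef char app_char;
    #define APP_TEXT_IMPL(s) s      // leaves the literal unprefixed
#endif

// Extra level of indirection so that macro arguments are expanded first.
#define APP_TEXT(s) APP_TEXT_IMPL(s)

// String type that matches the selected character type.
typedef std::basic_string<app_char> app_string;

inline app_string greeting() { return APP_TEXT("hello"); }
```

The typedef gives the rest of the code one name for the character type, so only the literals need the macro.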

Truly compile-time string hashing in C++

Basically I need truly compile-time string hashing in C++. I don't care about the specifics of the technique; it can be templates, macros, anything. All the other hashing techniques I've seen so far can only generate a hash table (like 256 CRC32 hashes) at compile time, not a real hash.
In other words, I need to have this
printf("%d", SOMEHASH("string"));
to be compiled as (in pseudo-assembler)
push HASHVALUE
push "%d"
call printf
even in Debug builds, with no runtime operations on string. I am using GCC 4.2 and Visual Studio 2008 and I need the solution to be OK for those compilers (so no C++0x).
The trouble is that in C++03 the result of subscripting a string literal (i.e. accessing a single character) is not a compile-time constant suitable for use as a template parameter.
It is therefore not possible to do this. I recommend writing a script that computes the hashes and inserts them directly into the source code, i.e.
printf("%d", SOMEHASH("string"));
gets converted to
printf("%d", 257359823 /*SOMEHASH("string")*/ );
Write your own preprocessor that scans the source for SOMEHASH("") and replaces it with the computed hash. Then pass the output of that to the compiler.
(Similar techniques are used for I18N.)
With templates only the following syntax will work:
SOMEHASH<'s','t','r','i','n','g'>
See, for example:
http://arcticinteractive.com/2009/04/18/compile-time-string-hashing-boost-mpl/
or
compile-time string hashing
You have to wait for user-defined literals in C++0x for this.
If you don't mind using the new C++0x standard in your code (some answers also link to techniques that work in the older C++03 standard), these questions have been asked before on Stack Overflow:
Compile-time (preprocessor) hashing of string
Compile time string hashing
Both of those contain answers that will help you figure out how to possibly implement this.
Here is a blog post that shows how to use Boost.MPL Compile Time String Hashing
That's not possible. It might be in C++0x, but definitely not in C++03.
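For readers who can use C++0x/C++11, the following is a minimal sketch of what the answers above allude to. It relies on constexpr functions, so it does not meet the GCC 4.2 / Visual Studio 2008 constraint in the question. The choice of 32-bit FNV-1a is mine, and HashConstant is a made-up helper; SOMEHASH is the macro name from the question.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit FNV-1a, written as a single-return recursive function so that it
// is a valid C++11 constexpr function.
constexpr std::uint32_t fnv1a(const char* s, std::uint32_t h = 2166136261u) {
    return *s == '\0'
        ? h
        : fnv1a(s + 1,
                (h ^ static_cast<std::uint32_t>(static_cast<unsigned char>(*s))) * 16777619u);
}

// A template argument must be a constant expression, so routing the hash
// through this helper forces the compiler to compute it.
template <std::uint32_t H>
struct HashConstant {
    static constexpr std::uint32_t value = H;
};

#define SOMEHASH(str) (HashConstant<fnv1a(str)>::value)

static_assert(fnv1a("") == 2166136261u, "empty string yields the offset basis");
static_assert(fnv1a("a") == 0xE40C292Cu, "FNV-1a test vector for \"a\"");
static_assert(SOMEHASH("string") == fnv1a("string"), "macro and function agree");
```

Since the value is unsigned, printing it with printf would call for %u rather than %d.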

C++ Dynamically convert string to any basic type

In C++ I need to convert a string to any type at runtime where I do not know what type I might be getting in the string. I have heard there is a lexical_cast in boost that I can use, but what would be the most effective way to implement it?
I might get a bunch of string like this from a client: Date="25/08/2010", Someval="2", Blah="25.5".
Now I want to be able to convert these strings to their types, e.g. Someval is obviously an int, and the Date could be a boost::date or whatever. The point is, I don't know at this time in what order these would be given to me, so it's hard to write code that performs a fixed sequence of casts.
I could use a bunch of if/else statements or switch/case statements, but I'm thinking that there is possibly a better way to do this.
I'm not looking for something different from lexical_cast, I can totally use that; I am looking to see if someone knows a better way than doing this:
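#include <boost/lexical_cast.hpp>
#include <string>
std::string str = "256";
int a = boost::lexical_cast<int>(str);
// now check if the cast worked (it throws boost::bad_lexical_cast on failure); if not, try another...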
This is too much of a guessing game, and if I have 10 possible types for any given string, it sounds a bit inefficient, especially if it has to do thousands of these at any given time.
Can anybody advise?
Alex Brown notes - the example string is a fragment of the XML data that comes from the client.
Use an XML parser to read XML data, it will do almost all of the legwork for you, and deal with the ordering issues. Then you simply need to ask the parser for the data you need for the calculation.
Details differ with different XML parsers - go find one, read the documentation. If you need more help, come back here with an XML parser question.
GMan is right, you cannot cast an arbitrary string to, for example, a Date type if the underlying data structure is different. You can, however, parse the content and instantiate a new object using the data in the string. std::atoi(), for example, parses a C string into an int.
You need to parse the string, not cast it.
What you're describing is actually a parser. Even the trial-and-error approach using lexical_cast is really just a (crude) parser.
I suggest clarifying the format of the input string and then, if it's simple enough, writing a recursive-descent parser by hand to parse the input string into whatever data structure is convenient for your needs.
You could use a VARIANT type of struct (i.e. one field for every possible result, plus a "type" field from a big enum of types that specifies which one it was), and a ConvertStringToVariant() function.
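A minimal sketch of that idea follows, assuming only three result types. The names ValueType, Variant and its fields are made up; ConvertStringToVariant is the name suggested above. Integers are tried first, then floating-point numbers, and anything else (including dates such as "25/08/2010") is kept as a string for a later, more specific parser.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical tag saying which field of Variant holds the result.
enum ValueType { TYPE_INT, TYPE_DOUBLE, TYPE_STRING };

struct Variant {
    ValueType type;
    long intValue;
    double doubleValue;
    std::string stringValue;  // always holds the original text
};

Variant ConvertStringToVariant(const std::string& text) {
    Variant v;
    v.type = TYPE_STRING;
    v.intValue = 0;
    v.doubleValue = 0.0;
    v.stringValue = text;
    if (text.empty()) return v;

    char* end = 0;

    // Accept as an integer only if the whole string was consumed.
    errno = 0;
    const long asLong = std::strtol(text.c_str(), &end, 10);
    if (errno == 0 && *end == '\0') {
        v.type = TYPE_INT;
        v.intValue = asLong;
        return v;
    }

    // Otherwise try a floating-point number, again requiring full consumption.
    errno = 0;
    const double asDouble = std::strtod(text.c_str(), &end);
    if (errno == 0 && *end == '\0') {
        v.type = TYPE_DOUBLE;
        v.doubleValue = asDouble;
        return v;
    }

    return v;  // fall back to the raw string
}
```

The caller can then switch on the type field instead of retrying casts for every value.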
"This is too much of a guessing game, and if I have 10 possible types, for any given string"
If you're concerned about this, you need a lexical analyzer, such as flex or Boost::Spirit.
It will still be a guessing game, but a more "informed" guessing one.